Help for kernel module programming
Hi,

I am writing a kernel module for assigning an IP address to an interface. I have included linux/igmp.h, but whenever I use a function declared in igmp.h, I get an "unresolved symbol" error for that function. I am new to kernel programming. I use the following command to compile it:

gcc -c -D__KERNEL__ -DMODULE -I/home/newkernelsource/linux-2.4.22/include hello.c

--
Regards,
Prajakta Choudhari, Project Engineer, Networking and Internet Software Group, CDAC, Pune
Email: [EMAIL PROTECTED] Mobile: 9890302701
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.19-rc6-mm1: drivers/net/chelsio/: unused code
On Wed, 29 Nov 2006 08:36:09 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote:

> On Mon, Nov 27, 2006 at 10:24:55AM -0800, Stephen Hemminger wrote:
> > On Fri, 24 Nov 2006 01:17:31 +0100
> > Adrian Bunk <[EMAIL PROTECTED]> wrote:
> >
> > > On Thu, Nov 23, 2006 at 02:17:03AM -0800, Andrew Morton wrote:
> > > >...
> > > > Changes since 2.6.19-rc5-mm2:
> > > >...
> > > > +chelsio-22-driver.patch
> > > >...
> > > > netdev updates
> > >
> > > It is suspicious that the following newly added code is completely unused:
> > > drivers/net/chelsio/ixf1010.o
> > > t1_ixf1010_ops
> > > drivers/net/chelsio/mac.o
> > > t1_chelsio_mac_ops
> > > drivers/net/chelsio/vsc8244.o
> > > t1_vsc8244_ops
> > >
> > > cu
> > > Adrian
> > >
> >
> > All that is gone in later version. I reposted new patches
> > after -mm2 was done.
>
> It seems these patches didn't make it into 2.6.19-rc6-mm2 ?
>

I dropped that patch and picked up Francois's tree instead.
Re: [PATCH] lockdep: fix sk->sk_callback_lock locking
Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> =
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.19-rc6 #4
> -
> nc/1854 just changed the state of lock:
> (af_callback_keys + sk->sk_family#2){-.-?}, at: []
> sock_def_error_report+0x1f/0x90
> but this lock was taken by another, soft-irq-safe lock in the past:
> (slock-AF_INET){-+..}
>
> and interrupts could create inverse lock ordering between them.

I think this is bogus. The slock is not a standard lock. When we hold it in process context we don't actually hold the spin lock part of it. However, it does prevent the softirq path from running in critical sections, which also prevents any attempt to grab the callback lock from softirq context.

If you still think there is a problem, please show an actual scenario where it deadlocks.

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
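The behaviour described in this reply can be illustrated with a small user-space model. It is only a sketch under assumptions: `model_sock`, `model_lock_sock`, `model_release_sock` and `model_softirq_rx` are hypothetical names, a C11 `atomic_flag` stands in for the kernel spinlock, and contention between two process-context owners is ignored. What it tries to show is that process context takes the spinlock only briefly to set an "owned" flag, and the softirq path defers its work while that flag is set.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space model of a socket lock with an "owned" flag. */
struct model_sock {
    atomic_flag slock;  /* stands in for the spinlock part of the lock */
    int owned;          /* set while process context owns the socket */
    int backlog;        /* packets deferred by the softirq path */
    int delivered;      /* packets actually processed */
};

static void model_spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* spin */
}

static void model_spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

static void model_sock_init(struct model_sock *sk)
{
    atomic_flag_clear(&sk->slock);
    sk->owned = 0;
    sk->backlog = 0;
    sk->delivered = 0;
}

/* Process context: mark the socket owned. The spinlock is NOT held on return. */
static void model_lock_sock(struct model_sock *sk)
{
    model_spin_lock(&sk->slock);
    sk->owned = 1;
    model_spin_unlock(&sk->slock);
}

/* Process context: drain whatever the softirq path deferred, then drop ownership. */
static void model_release_sock(struct model_sock *sk)
{
    model_spin_lock(&sk->slock);
    sk->delivered += sk->backlog;
    sk->backlog = 0;
    sk->owned = 0;
    model_spin_unlock(&sk->slock);
}

/* Softirq path: returns 1 if the packet was processed now, 0 if it was deferred. */
static int model_softirq_rx(struct model_sock *sk)
{
    int processed;

    model_spin_lock(&sk->slock);
    if (sk->owned) {
        sk->backlog++;      /* owner is in its critical section: defer */
        processed = 0;
    } else {
        sk->delivered++;    /* nobody owns the socket: process directly */
        processed = 1;
    }
    model_spin_unlock(&sk->slock);
    return processed;
}
```

In this model a call to `model_softirq_rx()` made between `model_lock_sock()` and `model_release_sock()` only increments the backlog, so the deferred path never runs inside the owner's critical section.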
Re: 2.6.19-rc6-mm1: drivers/net/chelsio/: unused code
On Mon, Nov 27, 2006 at 10:24:55AM -0800, Stephen Hemminger wrote:
> On Fri, 24 Nov 2006 01:17:31 +0100
> Adrian Bunk <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Nov 23, 2006 at 02:17:03AM -0800, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.19-rc5-mm2:
> > >...
> > > +chelsio-22-driver.patch
> > >...
> > > netdev updates
> >
> > It is suspicious that the following newly added code is completely unused:
> > drivers/net/chelsio/ixf1010.o
> > t1_ixf1010_ops
> > drivers/net/chelsio/mac.o
> > t1_chelsio_mac_ops
> > drivers/net/chelsio/vsc8244.o
> > t1_vsc8244_ops
> >
> > cu
> > Adrian
> >
>
> All that is gone in later version. I reposted new patches
> after -mm2 was done.

It seems these patches didn't make it into 2.6.19-rc6-mm2 ?

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: [patch] Mark rdtsc as sync only for netburst, not for core2
Zhang, Yanmin wrote:
> If it's a single processor, the go backwards issue doesn't exist.
> Below is my patch based on Arjan's. It's against 2.6.19-rc5-mm2.

Hi,

this patch is incorrect

> --- linux-2.6.19-rc5-mm2_arjan/arch/x86_64/kernel/setup.c 2006-11-29 10:41:21.0 +0800
> +++ linux-2.6.19-rc5-mm2_arjan_fix/arch/x86_64/kernel/setup.c 2006-11-29 10:42:28.0 +0800
> @@ -861,7 +861,7 @@ static void __cpuinit init_intel(struct
>  		set_bit(X86_FEATURE_CONSTANT_TSC, &c->x86_capability);
>  	if (c->x86 == 6)
>  		set_bit(X86_FEATURE_REP_GOOD, &c->x86_capability);
> -	if (c->x86 == 15)
> +	if (c->x86 == 15 && num_possible_cpus() != 1)
>  		set_bit(X86_FEATURE_SYNC_RDTSC, &c->x86_capability);

first of all, you probably meant "|| num_possible_cpus() == 1"

but second of all, the core2 cpus are dual core, so... what does it bring you at all?
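To make the difference between the two conditions concrete, here is a minimal sketch that evaluates both as plain predicates. The function names are hypothetical and exist only for this illustration; `family` plays the role of `c->x86` and `possible_cpus` the role of `num_possible_cpus()` from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as posted in the patch: family 15 AND more than one possible CPU. */
static bool sync_rdtsc_as_posted(int family, int possible_cpus)
{
    return family == 15 && possible_cpus != 1;
}

/* Condition as suggested in the reply: family 15 OR exactly one possible CPU. */
static bool sync_rdtsc_as_suggested(int family, int possible_cpus)
{
    return family == 15 || possible_cpus == 1;
}
```

With a single possible CPU the posted form never sets the flag, even on family 15, while the suggested form sets it on every family. With two or more CPUs both forms reduce to the original `family == 15` test.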
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
On 29-11-2006 05:25, David Miller wrote:
...
> commit 93e3a20d6c67a09b867431e7d5b3e7bc97154fab
> Author: David S. Miller <[EMAIL PROTECTED]>
> Date: Tue Nov 28 20:24:10 2006 -0800
>
> [NET]: Fix MAX_HEADER setting.
>
> MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and
> this is controlled by a set of CONFIG_* ifdef tests.
...
> Noticed by Patrick McHardy.

And if we talk about names:

+ Spotted by Krzysztof Halasa.

probably wouldn't be too exaggerated...

> Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

Jarek P.
Re: 2.6.19-rc6-rt8: alsa xruns
* Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:

> > I'll turn off the machine and cold boot it...)
>
> No difference, actually it looks like the regression re-regresses if I
> enable the trace... Arghhh.

yeah, that happens sometimes if some race is particularly narrow :-/

> Toggling /proc/sys/kernel/trace_enabled makes the long xruns reported
> by jack come and go.

i'll try to reproduce it. Can you see it with my yum kernel too? (that would simplify checking this on many testboxes)

	Ingo
2.6.19-rc6-rt9 - fails to compile
Hi!

2.6.19-rc6-rt9 fails to compile on my Dual Core Notebook with FC6.

  CHK include/linux/version.h
  CHK include/linux/utsrelease.h
  CC arch/i386/kernel/asm-offsets.s
In file included from include/linux/time.h:7,
                 from include/linux/timex.h:57,
                 from include/linux/sched.h:50,
                 from include/linux/module.h:9,
                 from include/linux/crypto.h:21,
                 from arch/i386/kernel/asm-offsets.c:7:
include/linux/seqlock.h: In function '__read_seqretry':
include/linux/seqlock.h:139: error: expected expression before 'do'
In file included from include/linux/module.h:9,
                 from include/linux/crypto.h:21,
                 from arch/i386/kernel/asm-offsets.c:7:
include/linux/sched.h: In function 'dequeue_signal_lock':
include/linux/sched.h:1478: error: expected expression before 'do'
make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1
make: *** [prepare0] Error 2

www.marcush.de/kernel/.config

Regards,
Marcus
Re: 2.6.19-rc6-rt8: alsa xruns
* Daniel Walker <[EMAIL PROTECTED]> wrote:

> > i fixed this in -rt8: the latency tracer now uses the time of day
> > clocksource - pmtimer in this case. (that means function tracing is
> > slower than with the TSC, but latency figures are more reliable.)
>
> I have a patch set to make the using the clocksources a little nicer..
> Is there anything I should add to that interface to help enable
> latency tracing, or are you satisfied with using the timekeeping
> clocksource? It might get constrictive after a while.

please talk to John and Thomas about GTOD interfaces. Right now the solution used by the latency tracer is working out pretty OK - but if something better comes along i can use that too. It's not a burning issue though, unless you know of some bug. (i'm not sure what you mean by it becoming constrictive)

	Ingo
Re: [patch] x86: unify/rewrite SMP TSC sync code
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Andrew,
>
> could we try this one in -mm? It unifies (and simplifies) the TSC sync
> code between i386 and x86_64, and also offers a stronger guarantee
> that we'll only activate the TSC clock on CPU where the TSC is synced
> correctly by the hardware.

updated patch below. (Mike Galbraith reported that suspend broke on -rt kernels, it was due to an __init/__cpuinit mismatch)

	Ingo

->
Subject: x86: rewrite SMP TSC sync code
From: Ingo Molnar <[EMAIL PROTECTED]>

make the TSC synchronization code more robust, and unify it between x86_64 and i386.

The biggest change is the removal of the 'fix up TSCs' code on x86_64 and i386, in some rare cases it was /causing/ time-warps on SMP systems.

The new code only checks for TSC asynchronity - and if it can prove a time-warp (if it can observe the TSC going backwards when going from one CPU to another within a critical section), then the TSC clock-source is turned off.

The TSC synchronization-checking code also got moved into a separate file.
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/i386/kernel/Makefile     |    2
 arch/i386/kernel/smpboot.c    |  178 ++--
 arch/i386/kernel/tsc.c        |    4
 arch/i386/kernel/tsc_sync.c   |    1
 arch/x86_64/kernel/Makefile   |    2
 arch/x86_64/kernel/smpboot.c  |  230 ++
 arch/x86_64/kernel/time.c     |   11 ++
 arch/x86_64/kernel/tsc_sync.c |  187 ++
 include/asm-i386/tsc.h        |   49
 include/asm-x86_64/proto.h    |    2
 include/asm-x86_64/timex.h    |   26
 include/asm-x86_64/tsc.h      |   66
 12 files changed, 295 insertions(+), 463 deletions(-)

Index: linux/arch/i386/kernel/Makefile
===
--- linux.orig/arch/i386/kernel/Makefile
+++ linux/arch/i386/kernel/Makefile
@@ -18,7 +18,7 @@ obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)	+= cpuid.o
 obj-$(CONFIG_MICROCODE)	+= microcode.o
 obj-$(CONFIG_APM)		+= apm.o
-obj-$(CONFIG_X86_SMP)		+= smp.o smpboot.o
+obj-$(CONFIG_X86_SMP)		+= smp.o smpboot.o tsc_sync.o
 obj-$(CONFIG_X86_TRAMPOLINE)	+= trampoline.o
 obj-$(CONFIG_X86_MPPARSE)	+= mpparse.o
 obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o nmi.o
Index: linux/arch/i386/kernel/smpboot.c
===
--- linux.orig/arch/i386/kernel/smpboot.c
+++ linux/arch/i386/kernel/smpboot.c
@@ -88,12 +88,6 @@ cpumask_t cpu_possible_map;
 EXPORT_SYMBOL(cpu_possible_map);
 static cpumask_t smp_commenced_mask;
 
-/* TSC's upper 32 bits can't be written in eariler CPU (before prescott), there
- * is no way to resync one AP against BP. TBD: for prescott and above, we
- * should use IA64's algorithm
- */
-static int __devinitdata tsc_sync_disabled;
-
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
 EXPORT_SYMBOL(cpu_data);
@@ -210,151 +204,6 @@ valid_k7:
 	;
 }
 
-/*
- * TSC synchronization.
- *
- * We first check whether all CPUs have their TSC's synchronized,
- * then we print a warning if not, and always resync.
- */
-
-static struct {
-	atomic_t start_flag;
-	atomic_t count_start;
-	atomic_t count_stop;
-	unsigned long long values[NR_CPUS];
-} tsc __initdata = {
-	.start_flag = ATOMIC_INIT(0),
-	.count_start = ATOMIC_INIT(0),
-	.count_stop = ATOMIC_INIT(0),
-};
-
-#define NR_LOOPS 5
-
-static void __init synchronize_tsc_bp(void)
-{
-	int i;
-	unsigned long long t0;
-	unsigned long long sum, avg;
-	long long delta;
-	unsigned int one_usec;
-	int buggy = 0;
-
-	printk(KERN_INFO "checking TSC synchronization across %u CPUs: ", num_booting_cpus());
-
-	/* convert from kcyc/sec to cyc/usec */
-	one_usec = cpu_khz / 1000;
-
-	atomic_set(&tsc.start_flag, 1);
-	wmb();
-
-	/*
-	 * We loop a few times to get a primed instruction cache,
-	 * then the last pass is more or less synchronized and
-	 * the BP and APs set their cycle counters to zero all at
-	 * once. This reduces the chance of having random offsets
-	 * between the processors, and guarantees that the maximum
-	 * delay between the cycle counters is never bigger than
-	 * the latency of information-passing (cachelines) between
-	 * two CPUs.
-	 */
-	for (i = 0; i < NR_LOOPS; i++) {
-		/*
-		 * all APs synchronize but they loop on '== num_cpus'
-		 */
-		while (atomic_read(&tsc.count_start) != num_booting_cpus()-1)
-			cpu_relax();
-		atomic_set(&tsc.count_stop, 0);
-		wmb();
-		/*
-		 * this lets the APs save their current TSC:
-		 */
-
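The check described in the changelog above (take a timestamp inside a critical section and see whether it is earlier than the last one any CPU recorded) can be sketched in user-space C. This is not the code from tsc_sync.c, which is not shown in this post; `warp_check`, `warp_check_sample` and `stamp_from_value` are hypothetical names, a C11 `atomic_flag` stands in for the kernel lock, and the timestamp source is passed in as a callback so that the reading happens inside the locked region.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct warp_check {
    atomic_flag lock;       /* serialises read-compare-update */
    uint64_t last_stamp;    /* most recent timestamp recorded by any CPU */
    uint64_t max_warp;      /* largest backwards jump observed */
    unsigned int nr_warps;  /* number of backwards jumps observed */
};

static void warp_check_init(struct warp_check *wc)
{
    atomic_flag_clear(&wc->lock);
    wc->last_stamp = 0;
    wc->max_warp = 0;
    wc->nr_warps = 0;
}

/*
 * Take one sample. The timestamp is read while the lock is held, so any
 * value smaller than last_stamp means time really went backwards between
 * two consecutive holders of the lock. Returns 1 if a warp was seen.
 */
static int warp_check_sample(struct warp_check *wc,
                             uint64_t (*read_stamp)(void *ctx), void *ctx)
{
    int warped = 0;
    uint64_t now;

    while (atomic_flag_test_and_set_explicit(&wc->lock, memory_order_acquire))
        ; /* spin */

    now = read_stamp(ctx);
    if (now < wc->last_stamp) {
        uint64_t delta = wc->last_stamp - now;

        if (delta > wc->max_warp)
            wc->max_warp = delta;
        wc->nr_warps++;
        warped = 1;
    }
    wc->last_stamp = now;

    atomic_flag_clear_explicit(&wc->lock, memory_order_release);
    return warped;
}

/* The clock source would only be kept if no warp was ever proven. */
static int warp_check_clock_usable(const struct warp_check *wc)
{
    return wc->nr_warps == 0;
}

/* Stand-in for reading a hardware counter: returns the value ctx points at. */
static uint64_t stamp_from_value(void *ctx)
{
    return *(const uint64_t *)ctx;
}
```

Note that the sketch only detects warps and never tries to correct the counters, which matches the changelog's statement that the "fix up TSCs" code was removed.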
Re: The VFS cache is not freed when there is not enough free memory to allocate
Forward to the mailing list.

Sonic Zhang wrote:
> On 11/27/06, Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> > I haven't actually written any nommu userspace code, but it is obvious
> > that you must try to keep malloc to <= PAGE_SIZE (although order 2 and
> > even 3 allocations seem to be reasonable, from process context)...
> >
> > Then you would use something a bit more advanced than a linear array
> > to store data (a pagetable-like radix tree would be a nice, easy idea).
>
> But even if we split the 8M memory into 2048 x 4k blocks, we still face
> this failure. The key problem is that available memory is smaller than
> 2048 x 4k, while there is still a lot of VFS cache. The VFS cache can
> be freed, but the kernel allocation function ignores it. See the new test
> application.

Which kernel allocation function? If you can provide more details I'd like to get to the bottom of this.

Because the anonymous memory allocation in mm/nommu.c is all allocated with GFP_KERNEL from process context, and in that case, the allocator should not fail but call into page reclaim which in turn will free VFS caches.

> What's a better way to free the VFS cache in memory allocator?

It should be freeing it for you, so I'm not quite sure what is going on. Can you send over the kernel messages you see when the allocation fails?

Also, do you happen to know of a reasonable toolchain + emulator setup that I could test the nommu kernel with?

Thanks,
Nick
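The suggestion quoted above (keep each allocation at or below PAGE_SIZE and index the pieces with a pagetable-like structure instead of one linear array) might look roughly like the following user-space sketch. Everything here is an assumption made for illustration: the names `paged_buf`, `paged_buf_at` and `NODE_BYTES` are hypothetical, 4096 is an assumed page size, and only two levels are shown, so the addressable range is limited to one node of pointers times one page per pointer. A larger buffer would need a further level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NODE_BYTES 4096u                        /* assumed page size */
#define SLOTS (NODE_BYTES / sizeof(void *))     /* pointers that fit in one node */

/* Root node: exactly NODE_BYTES of leaf pointers, so it is itself one page. */
struct paged_buf {
    unsigned char *leaf[SLOTS];
};

static struct paged_buf *paged_buf_new(void)
{
    return calloc(1, sizeof(struct paged_buf));
}

/*
 * Return a pointer to the byte at offset 'off', allocating the leaf page on
 * first use. Returns NULL if the offset is out of range or allocation fails.
 * No single allocation ever exceeds NODE_BYTES.
 */
static unsigned char *paged_buf_at(struct paged_buf *pb, size_t off)
{
    size_t idx = off / NODE_BYTES;

    if (idx >= SLOTS)
        return NULL;
    if (pb->leaf[idx] == NULL) {
        pb->leaf[idx] = calloc(1, NODE_BYTES);
        if (pb->leaf[idx] == NULL)
            return NULL;
    }
    return pb->leaf[idx] + off % NODE_BYTES;
}

static void paged_buf_free(struct paged_buf *pb)
{
    size_t i;

    if (pb == NULL)
        return;
    for (i = 0; i < SLOTS; i++)
        free(pb->leaf[i]);
    free(pb);
}
```

The design choice is that callers address the buffer by offset and never assume contiguity across page boundaries, which is what lets the allocator satisfy each request from any free page.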
Re: [patch] genapic: default to physical mode on hotplug CPU kernels
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

> On Tue, Nov 28, 2006 at 09:23:22PM +0100, Ingo Molnar wrote:
> >
> > * Siddha, Suresh B <[EMAIL PROTECTED]> wrote:
> >
> > > On Tue, Nov 28, 2006 at 07:33:46AM +0100, Ingo Molnar wrote:
> > > > -	if (clusters <= 1 && max_cluster <= 8 && cluster_cnt[0] ==
> > > > max_cluster)
> > > > +	if (max_apic < 8)
> > >
> > > Patch mostly looks good. Instead of checking for max_apic, can we use
> > > cpus_weight(cpu_possible_map) <= 8
> >
> > ok - but i think it's still possible the BIOS tells us APIC IDs that are
> > larger than 7, even if there are fewer CPUs. So i think the patch below
> > should cover it. Agreed?
> >
>
> I think it is ok to use flat mode even when APIC IDs are larger than
> 7, as we rely on LDR's which are programmed using smp_processor_id().
>
> IMO, cpus_weight check should be fine.

hm - indeed. Then we can indeed do the patch below. Nice simplification!

	Ingo

->
From: Ingo Molnar <[EMAIL PROTECTED]>
Subject: [patch] genapic: default to physical mode on hotplug CPU kernels

default to physical mode on hotplug CPU kernels. Further simplify and clean up the APIC initialization code.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/genapic.c |   20 +++-
 arch/x86_64/kernel/mpparse.c |    2 +-
 include/asm-x86_64/apic.h    |    2 +-
 3 files changed, 5 insertions(+), 19 deletions(-)

Index: linux/arch/x86_64/kernel/genapic.c
===
--- linux.orig/arch/x86_64/kernel/genapic.c
+++ linux/arch/x86_64/kernel/genapic.c
@@ -33,25 +33,11 @@ u8 x86_cpu_to_log_apicid[NR_CPUS] = { [0
 struct genapic __read_mostly *genapic = &apic_flat;
 
 /*
- * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
+ * Choose the APIC routing mode:
  */
-void __init clustered_apic_check(void)
+void __init setup_apic_routing(void)
 {
-	unsigned int i, max_apic = 0;
-	u8 id;
-
-	/*
-	 * Determine the maximum APIC ID in use:
-	 */
-	for (i = 0; i < NR_CPUS; i++) {
-		id = bios_cpu_apicid[i];
-		if (id == BAD_APICID)
-			continue;
-		if (id > max_apic)
-			max_apic = id;
-	}
-
-	if (max_apic < 8)
+	if (cpus_weight(cpu_possible_map) <= 8)
 		genapic = &apic_flat;
 	else
 		genapic = &apic_physflat;
Index: linux/arch/x86_64/kernel/mpparse.c
===
--- linux.orig/arch/x86_64/kernel/mpparse.c
+++ linux/arch/x86_64/kernel/mpparse.c
@@ -302,7 +302,7 @@ static int __init smp_read_mpc(struct mp
 			}
 		}
 	}
-	clustered_apic_check();
+	setup_apic_routing();
 	if (!num_processors)
 		printk(KERN_ERR "MPTABLE: no processors registered!\n");
 	return num_processors;
Index: linux/include/asm-x86_64/apic.h
===
--- linux.orig/include/asm-x86_64/apic.h
+++ linux/include/asm-x86_64/apic.h
@@ -82,7 +82,7 @@ extern void setup_secondary_APIC_clock (
 extern int APIC_init_uniprocessor (void);
 extern void disable_APIC_timer(void);
 extern void enable_APIC_timer(void);
-extern void clustered_apic_check(void);
+extern void setup_apic_routing(void);
 static inline void lapic_timer_idle_broadcast(int broadcast) { }
 extern void setup_APIC_extened_lvt(unsigned char lvt_off, unsigned char vector,
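The decision made by setup_apic_routing() in the patch above boils down to a single comparison on the number of possible CPUs. The sketch below restates it outside the kernel. The enum and function names are hypothetical, and `flat_logical_id()` rests on a background assumption that is not spelled out in this thread: that flat logical mode addresses CPUs through an 8-bit mask with one bit per CPU, indexed by CPU number rather than by BIOS APIC ID.

```c
#include <assert.h>
#include <stdint.h>

enum apic_routing {
    ROUTING_FLAT,       /* corresponds to choosing apic_flat in the patch */
    ROUTING_PHYSFLAT    /* corresponds to choosing apic_physflat */
};

/* Same test as the patch: flat mode for up to 8 possible CPUs. */
static enum apic_routing choose_apic_routing(unsigned int possible_cpus)
{
    return possible_cpus <= 8 ? ROUTING_FLAT : ROUTING_PHYSFLAT;
}

/*
 * Assumed flat-mode logical ID: one bit per CPU in an 8-bit mask, derived
 * from the CPU number. Returns 0 when the CPU does not fit in the mask.
 */
static uint8_t flat_logical_id(unsigned int cpu)
{
    return cpu < 8 ? (uint8_t)(1u << cpu) : 0;
}
```

Under that assumption, the size of the BIOS-reported APIC IDs does not matter for flat mode, only the CPU count does, which is the point made in the quoted reply.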
Re: 2.6.19-rc6-rt8
* Karsten Wiese <[EMAIL PROTECTED]> wrote:

> After estimated 15 minutes more it bugged again.
> Related dmesg translates to linux error
> 	-EXDEV
> propably caused by the following lines:
>
>
> static int uhci_result_isochronous(struct uhci_hcd *uhci, struct urb *urb)

hm. Below are all the USB changes done by -rt. Maybe one of them has some side-effect?

	Ingo

Index: linux/drivers/usb/core/devio.c
===
--- linux.orig/drivers/usb/core/devio.c
+++ linux/drivers/usb/core/devio.c
@@ -309,10 +309,11 @@ static void async_completed(struct urb *
 	struct async *as = urb->context;
 	struct dev_state *ps = as->ps;
 	struct siginfo sinfo;
+	unsigned long flags;
 
-	spin_lock(&ps->lock);
-	list_move_tail(&as->asynclist, &ps->async_completed);
-	spin_unlock(&ps->lock);
+	spin_lock_irqsave(&ps->lock, flags);
+	list_move_tail(&as->asynclist, &ps->async_completed);
+	spin_unlock_irqrestore(&ps->lock, flags);
 	if (as->signr) {
 		sinfo.si_signo = as->signr;
 		sinfo.si_errno = as->urb->status;
Index: linux/drivers/usb/core/hcd.c
===
--- linux.orig/drivers/usb/core/hcd.c
+++ linux/drivers/usb/core/hcd.c
@@ -517,13 +517,11 @@ error:
 	}
 
 	/* any errors get returned through the urb completion */
-	local_irq_save (flags);
-	spin_lock (&urb->lock);
+	spin_lock_irqsave(&urb->lock, flags);
 	if (urb->status == -EINPROGRESS)
 		urb->status = status;
-	spin_unlock (&urb->lock);
+	spin_unlock_irqrestore(&urb->lock, flags);
 	usb_hcd_giveback_urb (hcd, urb);
-	local_irq_restore (flags);
 	return 0;
 }
 
@@ -551,8 +549,7 @@ void usb_hcd_poll_rh_status(struct usb_h
 	if (length > 0) {
 
 		/* try to complete the status urb */
-		local_irq_save (flags);
-		spin_lock(&hcd_root_hub_lock);
+		spin_lock_irqsave(&hcd_root_hub_lock, flags);
 		urb = hcd->status_urb;
 		if (urb) {
 			spin_lock(&urb->lock);
@@ -568,14 +565,13 @@ void usb_hcd_poll_rh_status(struct usb_h
 			spin_unlock(&urb->lock);
 		} else
 			length = 0;
-		spin_unlock(&hcd_root_hub_lock);
+		spin_unlock_irqrestore(&hcd_root_hub_lock, flags);
 
 		/* local irqs are always blocked in completions */
 		if (length > 0)
 			usb_hcd_giveback_urb (hcd, urb);
 		else
 			hcd->poll_pending = 1;
-		local_irq_restore (flags);
 	}
 
 	/* The USB 2.0 spec says 256 ms. This is close enough and won't
@@ -647,17 +643,15 @@ static int usb_rh_urb_dequeue (struct us
 	} else {				/* Status URB */
 		if (!hcd->uses_new_polling)
 			del_timer (&hcd->rh_timer);
-		local_irq_save (flags);
-		spin_lock (&hcd_root_hub_lock);
+		spin_lock_irqsave(&hcd_root_hub_lock, flags);
 		if (urb == hcd->status_urb) {
 			hcd->status_urb = NULL;
 			urb->hcpriv = NULL;
 		} else
 			urb = NULL;		/* wasn't fully queued */
-		spin_unlock (&hcd_root_hub_lock);
+		spin_unlock_irqrestore(&hcd_root_hub_lock, flags);
 		if (urb)
 			usb_hcd_giveback_urb (hcd, urb);
-		local_irq_restore (flags);
 	}
 
 	return 0;
@@ -1311,11 +1305,9 @@ void usb_hcd_endpoint_disable (struct us
 	WARN_ON (!HC_IS_RUNNING (hcd->state) && hcd->state != HC_STATE_HALT &&
 			udev->state != USB_STATE_NOTATTACHED);
 
-	local_irq_disable ();
-
 	/* ep is already gone from udev->ep_{in,out}[]; no more submits */
 rescan:
-	spin_lock (&hcd_data_lock);
+	spin_lock_irq(&hcd_data_lock);
 	list_for_each_entry (urb, &ep->urb_list, urb_list) {
 		int tmp;
 
@@ -1323,13 +1315,13 @@ rescan:
 		if (urb->status != -EINPROGRESS)
 			continue;
 		usb_get_urb (urb);
-		spin_unlock (&hcd_data_lock);
+		spin_unlock_irq(&hcd_data_lock);
 
-		spin_lock (&urb->lock);
+		spin_lock_irq(&urb->lock);
 		tmp = urb->status;
 		if (tmp == -EINPROGRESS)
 			urb->status = -ESHUTDOWN;
-		spin_unlock (&urb->lock);
+		spin_unlock_irq(&urb->lock);
 
 		/* kick hcd unless it's already returning this */
 		if (tmp == -EINPROGRESS) {
@@ -1352,8 +1344,7 @@ rescan:
 		/* list contents may have changed */
 		goto rescan;
 	}
-	spin_unlock (&hcd_data_lock);
-	local_irq_enable ();
+	spin_unlock_irq(&hcd_data_lock);
 
 	/*
Re: 2.6.19-rc6-rt8
* Hu Gang <[EMAIL PROTECTED]> wrote:

> > thanks, applied. I'll let the PPC -rt folks sort out the hack effects.
> > Do you have CONFIG_HIGH_RES_TIMERS enabled?
> no.
>
>
> [hugang@:~]$ uname -a
> Linux hugang.soulinfo.com 2.6.19-rc6-rt8 #2 PREEMPT Wed Nov 29 09:29:43 UTC
> 2006 ppc GNU/Linux
> [hugang@:~]$ zgrep CONFIG_HIGH_RES_TIMERS /proc/config.gz
> [hugang@:~]$

could you send me your config? (i'm just curious what else is enabled/disabled)

	Ingo
Re: 2.6.19-rc6-rt8
* Karsten Wiese <[EMAIL PROTECTED]> wrote:

> On Monday, 27 November 2006 10:49, Ingo Molnar wrote:
> > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from
>
> I saw usb transport errors here before rebooting with
> 	nmi_watchdog=0
> contained in kernel command line.

so nmi_watchdog=1 (or was it nmi_watchdog=2 ?) caused these problems - and then nmi_watchdog=0 fixed them? i686? Extremely weird. Does the patch below fix the issue perhaps?

	Ingo

Index: linux/arch/i386/kernel/nmi.c
===
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -932,12 +932,14 @@ notrace __kprobes int nmi_watchdog_tick(
 
 	__profile_tick(CPU_PROFILING, regs);
 
+#if 0
 	/* check for other users first */
 	if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
 			== NOTIFY_STOP) {
 		rc = 1;
 		touched = 1;
 	}
+#endif
 
 	/*
 	 * Take the local apic timer and PIT/HPET into account. We don't
Index: linux/arch/x86_64/kernel/nmi.c
===
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -814,12 +814,14 @@ int __kprobes nmi_watchdog_tick(struct p
 
 	__profile_tick(CPU_PROFILING, regs);
 
+#if 0
 	/* check for other users first */
 	if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
 			== NOTIFY_STOP) {
 		rc = 1;
 		touched = 1;
 	}
+#endif
 
 	sum = read_pda(apic_timer_irqs);
 	if (nmi_show_regs[cpu]) {
Re: 2.6.19-rc6-rt5
* Mark Knecht <[EMAIL PROTECTED]> wrote: > Forwarding it off list. > > Thanks Ingo. I'm very interested if it works for you to do this. i've integrated it into -rt (see the patch below), but i marked it obsolete and i might not be able to carry it for long - we'll see. The preferred solution is to use newer PAM and its rt-limits features. But to ease migration i'll keep the realtime-lsm for a while. Ingo --- security/Kconfig |9 +++ security/Makefile |1 security/realcap.c | 147 + 3 files changed, 157 insertions(+) Index: linux/security/Kconfig === --- linux.orig/security/Kconfig +++ linux/security/Kconfig @@ -80,6 +80,15 @@ config SECURITY_CAPABILITIES This enables the "default" Linux capabilities functionality. If you are unsure how to answer this question, answer Y. +config REALTIME_CAPABILITIES + tristate "Real-Time LSM (Obsolete)" + depends on SECURITY && EXPERIMENTAL + help + This is an obsolete LSM - use newer PAM and rt-limites + to manage your real-time apps. + + If you are unsure how to answer this question, answer N. + config SECURITY_ROOTPLUG tristate "Root Plug Support" depends on USB && SECURITY Index: linux/security/Makefile === --- linux.orig/security/Makefile +++ linux/security/Makefile @@ -15,4 +15,5 @@ obj-$(CONFIG_SECURITY)+= security.o d # Must precede capability.o in order to stack properly. 
obj-$(CONFIG_SECURITY_SELINUX) += selinux/built-in.o obj-$(CONFIG_SECURITY_CAPABILITIES)+= commoncap.o capability.o +obj-$(COMMON_REALTIME_CAPABILITIES)+= commoncap.o realcap.o obj-$(CONFIG_SECURITY_ROOTPLUG)+= commoncap.o root_plug.o Index: linux/security/realcap.c === --- /dev/null +++ linux/security/realcap.c @@ -0,0 +1,147 @@ +/* + * Realtime Capabilities Linux Security Module + * + * Copyright (C) 2003 Torben Hohn + * Copyright (C) 2003, 2004 Jack O'Quin + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + */ + +#include +#include + +#define RT_LSM "Realtime LSM " /* syslog module name prefix */ +#define RT_ERR "Realtime: "/* syslog error message prefix */ + +#include +MODULE_INFO(vermagic,VERMAGIC_STRING); + +/* module parameters + * + * These values could change at any time due to some process writing + * a new value in /sys/module/realtime/parameters. This is OK, + * because each is referenced only once in each function call. + * Nothing depends on parameters having the same value every time. 
+ */ + +/* if TRUE, any process is realtime */ +static int rt_any; +module_param_named(any, rt_any, int, 0644); +MODULE_PARM_DESC(any, " grant realtime privileges to any process."); + +/* realtime group id, or NO_GROUP */ +static int rt_gid = -1; +module_param_named(gid, rt_gid, int, 0644); +MODULE_PARM_DESC(gid, " the group ID with access to realtime privileges."); + +/* enable mlock() privileges */ +static int rt_mlock = 1; +module_param_named(mlock, rt_mlock, int, 0644); +MODULE_PARM_DESC(mlock, " enable memory locking privileges."); + +/* helper function for testing group membership */ +static inline int gid_ok(int gid) +{ + if (gid == -1) + return 0; + + if (gid == current->gid) + return 1; + + return in_egroup_p(gid); +} + +static void realtime_bprm_apply_creds(struct linux_binprm *bprm, int unsafe) +{ + cap_bprm_apply_creds(bprm, unsafe); + + /* If a non-zero `any' parameter was specified, we grant +* realtime privileges to every process. If the `gid' +* parameter was specified and it matches the group id of the +* executable, of the current process or any supplementary +* groups, we grant realtime capabilites. +*/ + + if (rt_any || gid_ok(rt_gid)) { + cap_raise(current->cap_effective, CAP_SYS_NICE); + if (rt_mlock) { + cap_raise(current->cap_effective, CAP_IPC_LOCK); + cap_raise(current->cap_effective, CAP_SYS_RESOURCE); + } + } +} + +static struct security_operations capability_ops = { + .ptrace = cap_ptrace, + .capget = cap_capget, + .capset_check = cap_capset_check, + .capset_set = cap_capset_set, + .capable = cap_capable, + .netlink_send = cap_netlink_send, + .netlink_recv
[PATCH] x86_64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq
please check the patch

[PATCH] x86_64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq

setup_IO_APIC_irqs could fail to get vector for some device when you have too many devices, because at that time only boot cpu is online. So check vector for irq in setup_ioapic_dest and call setup_IO_APIC_irq to make sure IO-APIC irq-routing table is initialized. Also separate setup_IO_APIC_irq from setup_IO_APIC_irqs.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 14654e6..496ba4e 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -796,27 +845,65 @@ static void ioapic_register_intr(int irq
 			handle_edge_irq, "edge");
 	}
 }
-
-static void __init setup_IO_APIC_irqs(void)
+static void __init setup_IO_APIC_irq(int apic, int pin, int idx, int irq)
 {
 	struct IO_APIC_route_entry entry;
-	int apic, pin, idx, irq, first_notcon = 1, vector;
+	int vector;
 	unsigned long flags;
 
-	apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
 
-	for (apic = 0; apic < nr_ioapics; apic++) {
-	for (pin = 0; pin < nr_ioapic_registers[apic]; pin++) {
+	/*
+	 * add it to the IO-APIC irq-routing table:
+	 */
+	memset(&entry,0,sizeof(entry));
 
-		/*
-		 * add it to the IO-APIC irq-routing table:
-		 */
-		memset(&entry,0,sizeof(entry));
 
+	entry.delivery_mode = INT_DELIVERY_MODE;
+	entry.dest_mode = INT_DEST_MODE;
+	entry.mask = 0;				/* enable IRQ */
+	entry.dest.logical.logical_dest = cpu_mask_to_apicid(TARGET_CPUS);
+
+	entry.trigger = irq_trigger(idx);
+	entry.polarity = irq_polarity(idx);
-		entry.delivery_mode = INT_DELIVERY_MODE;
-		entry.dest_mode = INT_DEST_MODE;
-		entry.mask = 0;				/* enable IRQ */
+	if (irq_trigger(idx)) {
+		entry.trigger = 1;
+		entry.mask = 1;
 		entry.dest.logical.logical_dest = cpu_mask_to_apicid(TARGET_CPUS);
+	}
+
+	if (!apic && !IO_APIC_IRQ(irq))
+		return;
+
+	if (IO_APIC_IRQ(irq)) {
+		cpumask_t mask;
+		vector = assign_irq_vector(irq, TARGET_CPUS, &mask);
+		if (vector < 0)
+			return;
+
+		entry.dest.logical.logical_dest = cpu_mask_to_apicid(mask);
+		entry.vector = vector;
+
+		ioapic_register_intr(irq, vector, IOAPIC_AUTO);
+		if (!apic && (irq < 16))
+			disable_8259A_irq(irq);
+	}
+
+	ioapic_write_entry(apic, pin, entry);
+
+	spin_lock_irqsave(&ioapic_lock, flags);
+	set_native_irq_info(irq, TARGET_CPUS);
+	spin_unlock_irqrestore(&ioapic_lock, flags);
+
+}
+
+static void __init setup_IO_APIC_irqs(void)
+{
+	int apic, pin, idx, irq, first_notcon = 1;
+
+	apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
+
+	for (apic = 0; apic < nr_ioapics; apic++) {
+	for (pin = 0; pin < nr_ioapic_registers[apic]; pin++) {
 
 		idx = find_irq_entry(apic,pin,mp_INT);
 		if (idx == -1) {
@@ -828,39 +915,11 @@ static void __init setup_IO_APIC_irqs(vo
 			continue;
 		}
 
-		entry.trigger = irq_trigger(idx);
-		entry.polarity = irq_polarity(idx);
-
-		if (irq_trigger(idx)) {
-			entry.trigger = 1;
-			entry.mask = 1;
-			entry.dest.logical.logical_dest = cpu_mask_to_apicid(TARGET_CPUS);
-		}
-
 		irq = pin_2_irq(idx, apic, pin);
 		add_pin_to_irq(irq, apic, pin);
 
-		if (!apic && !IO_APIC_IRQ(irq))
-			continue;
-
-		if (IO_APIC_IRQ(irq)) {
-			cpumask_t mask;
-			vector = assign_irq_vector(irq, TARGET_CPUS, &mask);
-			if (vector < 0)
-				continue;
+		setup_IO_APIC_irq(apic, pin, idx, irq);
 
-			entry.dest.logical.logical_dest = cpu_mask_to_apicid(mask);
-			entry.vector = vector;
-
-			ioapic_register_intr(irq, vector, IOAPIC_AUTO);
-			if (!apic && (irq < 16))
-				disable_8259A_irq(irq);
-		}
-		ioapic_write_entry(apic, pin, entry);
-
-		spin_lock_irqsave(&ioapic_lock, flags);
-		set_native_irq_info(irq, TARGET_CPUS);
-		spin_unlock_irqrestore(&ioapic_lock, flags);
 	}
 	}
 
@@ -2141,7 +2200,15 @@ void __init setup_ioapic_dest(void)
 			if (irq_entry == -1)
 				continue;
 			irq = pin_2_irq(irq_entry, ioapic, pin);
-			set_ioapic_affinity_irq(irq, TARGET_CPUS);
+
+			/* setup_IO_APIC_irqs could fail to get vector for some device
+			 * when you have too many devices, because at that time only boot
+			 * cpu is online.
+			 */
+			if(!irq_vector[irq])
+				setup_IO_APIC_irq(ioapic, pin, irq_entry, irq);
+			else
+				set_ioapic_affinity_irq(irq, TARGET_CPUS);
 		}
 
 	}
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
On Tue, Nov 28, 2006 at 09:04:16PM -0800, David Miller wrote: > > > Definitely. I'm not sure whether 48 is enough even for recursive > > tunnels. This should really just be a hint. It's OK to spend a > > bit of time reallocating skb's if it's too small, but it's not OK > > to die. > > The recursive tunnel case is handled by the PMTU reductions > in the route, isn't it? Oh, I wasn't suggesting that the current code is broken. I'm just emphasising that LL_MAX_HEADER is by no means the *maximum* header size in a Linux system. Anybody should be able to load a new NIC module with a hard header size exceeding LL_MAX_HEADER and the system should still function (albeit slower, since every packet sent down that device has to be reallocated). In particular, nested tunnels are one such device, which anybody can construct without writing a kernel module. As to getting rid of those ifdefs, here is one idea. We keep a read-mostly global variable that represents the actual current maximum LL header size. Every time a new device appears (or if its hard header size changes) we update this variable if needed. Hmm, we don't actually update the hard header size should the underlying device change for tunnels. Good thing the tunnels only use that as a hint and reallocate if necessary :) This is not optimal in that it never decreases, but it's certainly better than a compile-time constant (e.g., people using distribution kernels don't necessarily use tunnels). Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
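For readers following along, the read-mostly global idea could be sketched roughly like this. It is a userspace illustration only, not kernel code: the names max_ll_header, note_hard_header_len() and headroom_hint() are made up here, the default of 32 is an arbitrary stand-in, and locking is left out entirely.

```c
/* Hypothetical starting value; the real compile-time default differs
 * per configuration. */
#define DEFAULT_LL_HEADER 32u

/* Hypothetical read-mostly global: largest hard header length seen so far. */
static unsigned int max_ll_header = DEFAULT_LL_HEADER;

/* Would be called when a device appears or its hard header length
 * changes. The value only ever grows, which matches the "never
 * decreases" caveat in the mail. */
static void note_hard_header_len(unsigned int hard_header_len)
{
	if (hard_header_len > max_ll_header)
		max_ll_header = hard_header_len;
}

/* Headroom hint for a new buffer. It is only a hint: a sender still
 * has to reallocate if a particular device needs more than this. */
static unsigned int headroom_hint(void)
{
	return max_ll_header;
}
```

The point of the design is that the hint tracks what is actually loaded, so a system without tunnels never pays for the extra 48 bytes.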
Re: 2.6.19-rc6-rt8
On Wed, 29 Nov 2006 07:41:09 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote: > > * Hu Gang <[EMAIL PROTECTED]> wrote: > > > On Mon, 27 Nov 2006 10:49:27 +0100 > > Ingo Molnar <[EMAIL PROTECTED]> wrote: > > > > > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from > > > the usual place: > > > > > > http://redhat.com/~mingo/realtime-preempt/ > > > > attached patch to making it compile and works in my PowerBook G4. > > thanks, applied. I'll let the PPC -rt folks sort out the hack effects. > Do you have CONFIG_HIGH_RES_TIMERS enabled? no. [hugang@:~]$ uname -a Linux hugang.soulinfo.com 2.6.19-rc6-rt8 #2 PREEMPT Wed Nov 29 09:29:43 UTC 2006 ppc GNU/Linux [hugang@:~]$ zgrep CONFIG_HIGH_RES_TIMERS /proc/config.gz [hugang@:~]$ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.19-rc6-rt8
* Hu Gang <[EMAIL PROTECTED]> wrote: > On Mon, 27 Nov 2006 10:49:27 +0100 > Ingo Molnar <[EMAIL PROTECTED]> wrote: > > > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from > > the usual place: > > > > http://redhat.com/~mingo/realtime-preempt/ > > attached patch to making it compile and works in my PowerBook G4. thanks, applied. I'll let the PPC -rt folks sort out the hack effects. Do you have CONFIG_HIGH_RES_TIMERS enabled? Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Slab: Remove kmem_cache_t
Linus Torvalds wrote: So typedefs are good for - "u8"/"u16"/"u32"/"u64" kind of things, where the underlying types really are potentially different on different architectures. - "sector_t"-like things which may be 32-bit or 64-bit depending on some CONFIG_LBD option or other. - as a special case, "sparse" actually makes bitwise typedefs have real meaning as types, so if you are using sparse to distinguish between a little-endian 16-bit entity or a big-endian 16-bit entity, the typedef there is actually important and has real meaning to sparse (without the typedef, each bitwise type declaration would be strictly a _different_ type from another bitwise type declaration that otherwise looks the same). But typedefs are NOT good for: - trying to avoid typing a few characters: "kmem_cache_t" is strictly _worse_ than "struct kmem_cache", not just because it causes declaration issues. It also hides the fact that the thing really is a structure (and hiding the fact that it's a pointer is a shooting offense: things like "voidptr_t" should not be allowed at all) - incorrect "portability". the POSIX "socklen_t" was not only a really bad way to write "int", it actually caused a lot of NON-portability, and made some people think it should be "size_t" or something equally broken. The one excuse for typedefs in the "typing" sense can be complicated function pointer types. Some function pointers are just too easy to screw up, and using a typedef (*myfn_t)(int, ...); can be preferable over forcing people to write that really complex kind of type out every time. But that shouldn't be overused either (but we use it for things like "readdir_t", for example, for exactly this reason). You are saying that they should only be used to create new "primitive" types (ie. that you can use in arithmetic / logical ops) that can change depending on the config. That's fair enough. 
I'm sure you've also said in the past that they can be used (IIRC you even encouraged it) when the type is opaque in the context it is being used. I won't bother trying to dig out the post, because I could be wrong and you are entitled to change your mind. I just want to get this straight. Thanks, Nick -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
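To make the distinction in the quoted rules concrete, here is a small sketch in plain C. The names struct cache, cache_lookup(), compare_fn_t and sort_ints() are invented for illustration and are not kernel APIs.

```c
#include <stddef.h>

/* An opaque structure: a forward declaration is enough for code that
 * only passes pointers around, and the "struct" keyword keeps it
 * obvious that this is a structure. No typedef is needed. */
struct cache;
struct cache *cache_lookup(struct cache *c);

/* A function-pointer typedef: here the typedef earns its keep because
 * the spelled-out type is easy to get wrong. */
typedef int (*compare_fn_t)(const int *a, const int *b);

static int compare_ints(const int *a, const int *b)
{
	return (*a > *b) - (*a < *b);
}

/* Simple insertion sort that takes the typedef'd comparison callback. */
static void sort_ints(int *v, size_t n, compare_fn_t cmp)
{
	for (size_t i = 1; i < n; i++) {
		int key = v[i];
		size_t j = i;

		while (j > 0 && cmp(&v[j - 1], &key) > 0) {
			v[j] = v[j - 1];
			j--;
		}
		v[j] = key;
	}
}
```

Whether an opaque type also deserves a typedef is exactly the open question in this mail; the sketch takes no side on that.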
[PATCH -mm] ALSA: add struct forward declaration
From: Randy Dunlap <[EMAIL PROTECTED]> I see about 10 sets of these in a random config. CC drivers/media/video/saa7134/saa7134-cards.o In file included from drivers/media/video/saa7134/saa7134.h:43, from drivers/media/video/saa7134/saa7134-cards.c:27: include/sound/pcm.h:59: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:59: warning: its scope is only this definition or declaration, which is probably not what you want include/sound/pcm.h:60: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:62: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:64: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:65: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:66: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:67: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:68: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:71: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:73: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:75: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:76: warning: 'struct snd_pcm_substream' declared inside parameter list include/sound/pcm.h:77: warning: 'struct snd_pcm_substream' declared inside parameter list Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> --- include/sound/pcm.h |2 ++ 1 file changed, 2 insertions(+) --- linux-2.6.19-rc6-mm2.orig/include/sound/pcm.h +++ linux-2.6.19-rc6-mm2/include/sound/pcm.h @@ -55,6 +55,8 @@ struct snd_pcm_hardware { size_t fifo_size; /* fifo size in bytes */ }; +struct snd_pcm_substream; + struct snd_pcm_ops { int (*open)(struct snd_pcm_substream *substream); int (*close)(struct snd_pcm_substream *substream); 
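The warnings quoted in the patch above come from a struct name first appearing inside a prototype's parameter list. A standalone sketch of the same fix, using invented names (struct substream, struct stream_ops, demo_open, demo_close) rather than the ALSA ones:

```c
/* Forward declaration at file scope. Without this line, a compiler
 * treats "struct substream" inside each parameter list below as a new
 * type local to that one prototype and warns about it. */
struct substream;

struct stream_ops {
	int (*open)(struct substream *s);
	int (*close)(struct substream *s);
};

/* The full definition can come later; it completes the same type. */
struct substream {
	int open_count;
};

static int demo_open(struct substream *s)  { return ++s->open_count; }
static int demo_close(struct substream *s) { return --s->open_count; }

static const struct stream_ops demo_ops = {
	.open  = demo_open,
	.close = demo_close,
};
```

Removing the forward declaration should reproduce the "declared inside parameter list" warning with gcc.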
The return of the dreaded "nobody cared" message with a Promise Card
Hi, Andrew, hi Alan, hi others. First of all, I would kindly ask you that you keep me in the Cc'ed messages. I'm currently finishing grades of (loads and loads) of students and I'm having a hard time keeping up with my health problems and real life work, let alone the traffic of lkml. Well, let me get straight to the problem. I have an Asus A7V (Classic) motherboard with a VIA KT133 chipset and it has the two usual VIA IDE controllers and two extra Promise PDC20265 controllers. Right now, my setup is the following (given advice that once Alan gave me, but he may not recall it): * hda: DVD+-RW burner; * hdc: Plain CD-ROM reader; * hde: Seagate ST3160021A (7200.??) drive; * hdg: QUANTUM FIREBALLlct15 30 drive. The problem is that whenever I plug the Quantum drive, I get stack traces like this one (with a bit of context, so that you can get sense of what I am talking about): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ide1 at 0x170-0x177,0x376 on irq 15 PDC20265: IDE controller at PCI slot :00:11.0 ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10 PCI: setting IRQ 10 as level-triggered ACPI: PCI Interrupt :00:11.0[A] -> Link [LNKB] -> GSI 10 (level, low) -> IRQ 10 PDC20265: chipset revision 2 PDC20265: ROM enabled at 0x3000 PDC20265: 100% native mode on irq 10 PDC20265: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode. ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:DMA, hdh:pio Probing IDE interface ide2... hde: ST3160021A, ATA DISK drive ide2 at 0x8800-0x8807,0x8402 on irq 10 Probing IDE interface ide3... 
hdg: QUANTUM FIREBALLlct15 30, ATA DISK drive irq 10: nobody cared (try booting with the "irqpoll" option) [] show_trace_log_lvl+0x58/0x16a [] show_trace+0xd/0x10 [] dump_stack+0x19/0x1b [] __report_bad_irq+0x2e/0x6f [] note_interrupt+0x19f/0x1d5 [] __do_IRQ+0xb5/0xeb [] do_IRQ+0x67/0x86 [] common_interrupt+0x1a/0x20 DWARF2 unwinder stuck at common_interrupt+0x1a/0x20 Leftover inexact backtrace: === handlers: [] (ide_intr+0x0/0x19b) Disabling IRQ #10 Warning: Secondary channel requires an 80-pin cable for operation. hdg reduced to Ultra33 mode. ide3 at 0x8000-0x8007,0x7802 on irq 10 hde: max request size: 128KiB hde: 312581808 sectors (160041 MB) w/2048KiB Cache, CHS=19457/255/63, UDMA(100) hde: cache flushes supported hde: hde1 hde2 hde3 hde4 hdg: max request size: 128KiB hdg: 58633344 sectors (30020 MB) w/418KiB Cache, CHS=58168/16/63, UDMA(33) hdg: cache flushes not supported hdg: hdg1 hdg2 hdg3 hdg4 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - This is what I get when I boot with a 2.6.18.3 (custom) kernel *AND* with the irqpoll option already enabled. The kernel 2.6.19-rc6 that I have here doesn't work at all if I pass the irqpoll option (it just freezes right at "Uncompressing Linux" nothing is displayed at least during a minute or so---I think that it hanged). Since Linus Torvalds once said something to the effect that "users that are willing to help with patches are worth their weight in gold", I would like to contribute here. :-) I am willing to do a git bisect to see which may be a problematic patch or not, but the "irq 10: nobody cared (try booting with the "irqpoll" option)" is one that I reported to Andrew quite some time ago (I thought that it had gone away), and it didn't manifest itself until I had to reuse this extra drive, since I am doing a work that is producing a lot of data. Please, if you want any further information, don't hesitate to ask. 
I can test even moderately invasive patches, since I'm taking regular backups of my system's vital data. Regards and thank you very much for any help, Rogério Brito. -- Rogério Brito : [EMAIL PROTECTED] : http://www.ime.usp.br/~rbrito Homepage of the algorithms package : http://algorithms.berlios.de Homepage on freshmeat: http://freshmeat.net/projects/algorithms/
Re: SCSI init discussion/SAN problem (not interesting)
Bernd Eckenfels wrote: In article <[EMAIL PROTECTED]> you wrote: Was this post just not interesting enough, or is it the lack of access to hardware to test this on that prevented it from being picked up by someone? see google, for example: http://christophe.varoqui.free.fr/multipath.html While that information is accurate, it is not new to me. I must have been unclear in my description of how the scsi device registration with the kernel causes multipath devices to function inefficiently. When a device has multiple paths, the kernel will see multiple scsi devices, even though there is only one physical device. For each of the scsi devices that the kernel can see, the partition table (or some other IO that I am unaware of) is read from the device, meaning IO is generated on ALL paths to the device. This isn't a problem for some devices, but on others it can initiate a failover process which can take many seconds, only to have the process repeated as IO is generated on a third path to the device. Is it unreasonable for the scsi initialization routines to be aware that some kernel scsi devices are really the same physical devices and register them with the kernel WITHOUT generating any IO on the physical device? Doing this, there would be a maximum of one failover per physical device during the boot sequence. This one failover could be eliminated if the scsi initialization code were aware of "active" paths and only generated IO on active paths, rather than the first path. All of this is before device mapper or multipath get their hands on the scsi devices. It is completely within the scope of the scsi initialization code in the kernel. Is this more clear? If not, could someone ask for clarification of the fuzzy parts? Evan.
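A toy model of the behaviour being asked for may help. This is only a sketch of the proposal, not of any existing scsi code: struct path, its wwid and active fields, and count_probes() are all hypothetical, and it assumes each path can report a shared device identifier and its active/passive state without triggering a failover.

```c
#include <string.h>

#define MAX_DEVICES 16

struct path {
	const char *wwid; /* assumed: identifier shared by all paths to one device */
	int active;       /* assumed: nonzero if this path is currently active */
};

/* Returns how many paths would receive probe I/O: at most one per
 * distinct wwid, and only for devices that have an active path. */
static int count_probes(const struct path *paths, int npaths)
{
	const char *probed[MAX_DEVICES];
	int nprobed = 0;

	for (int i = 0; i < npaths; i++) {
		int seen = 0;

		if (!paths[i].active)
			continue; /* never generate I/O on a passive path */
		for (int j = 0; j < nprobed; j++)
			if (strcmp(probed[j], paths[i].wwid) == 0)
				seen = 1;
		if (!seen && nprobed < MAX_DEVICES)
			probed[nprobed++] = paths[i].wwid;
	}
	return nprobed;
}
```

With five paths to two physical devices, this model issues two probes instead of five, and none of them on a passive path.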
Re: [PATCH] prune_icache_sb
Andrew Morton wrote: We shouldn't export this particular implementation to modules because it has bad failure modes. There might be a case for exposing an i_sb_list-based API or, perhaps better, a max-unused-inodes mount option. Ok, thanks for looking into this - it is appreciated. I'll try to figure out something else. -- Wendy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/12] ext3 balloc: reset windowsz when full
On Tue, 28 Nov 2006, Mingming Cao wrote: > Port a series of ext2 balloc patches from Hugh to ext3/4. The first 6 > patches are against ext3, and the rest are against ext4. Thanks for all that, Mingming: whichever is appropriate, all twelve Acked-by: Hugh Dickins <[EMAIL PROTECTED]> or Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]> I'll think about your other mails, those that need further thought, later on: I need to pin down more accurately the repetitious sequence of reservations in the mistaken case - maybe it indicated further issues, maybe not; and I need to consider our different views of the my_rsv find_next_usable_block. Hugh
Re: 2.6.19-rc6-mm2
Andrew Morton wrote: On Tue, 28 Nov 2006 19:24:45 -0500 Thomas Tuttle <[EMAIL PROTECTED]> wrote: I've found a couple of bugs so far... 1. I did `modprobe kvm' and then tried running a version of the KVM Qemu compiled for a different kernel. My mistake. But I got an oops: BUG: unable to handle kernel NULL pointer dereference at virtual address 0008 Code: 14 0f 87 77 02 00 00 8b 0c b5 00 15 20 f9 85 c9 0f 84 68 02 00 00 89 ea 89 f8 ff d1 85 c0 0f 84 4c 02 00 00 89 f8 e8 31 e9 ff ff <65> a1 08 00 00 00 8b 40 04 8b 40 08 a8 04 0f 85 ae 02 00 00 e8 EIP: [] kvm_vmx_return+0xef/0x4d0 [kvm] SS:ESP 0068:e5a4fd54 65 a1 08 00 00 00 mov %gs:0x8,%eax kvm isn't restoring gs properly. I'll look into it. Oh, and I get a ton of these messages with kvm: rtc: lost some interrupts at 1024Hz. I'll look into these too, though I'm not sure where. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
[PATCH -mm] char: drivers use/need PCI
From: Randy Dunlap <[EMAIL PROTECTED]> With CONFIG_PCI=n: drivers/char/mxser_new.c: In function 'mxser_release_res': drivers/char/mxser_new.c:2383: warning: implicit declaration of function 'pci_release_region' drivers/char/mxser_new.c: In function 'mxser_probe': drivers/char/mxser_new.c:2578: warning: implicit declaration of function 'pci_request_region' drivers/built-in.o: In function `sx_remove_card': sx.c:(.text.sx_remove_card+0x65): undefined reference to `pci_release_region' drivers/char/isicom.c: In function 'isicom_probe': drivers/char/isicom.c:1793: warning: implicit declaration of function 'pci_request_region' drivers/char/isicom.c:1827: warning: implicit declaration of function 'pci_release_region' Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> --- drivers/char/Kconfig |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- linux-2.6.19-rc6-mm2.orig/drivers/char/Kconfig +++ linux-2.6.19-rc6-mm2/drivers/char/Kconfig @@ -203,7 +203,7 @@ config MOXA_SMARTIO config MOXA_SMARTIO_NEW tristate "Moxa SmartIO support v. 2.0 (EXPERIMENTAL)" - depends on SERIAL_NONSTANDARD + depends on SERIAL_NONSTANDARD && PCI help Say Y here if you have a Moxa SmartIO multiport serial card and/or want to help develop a new version of this driver. @@ -218,7 +218,7 @@ config MOXA_SMARTIO_NEW config ISI tristate "Multi-Tech multiport card support (EXPERIMENTAL)" - depends on SERIAL_NONSTANDARD + depends on SERIAL_NONSTANDARD && PCI select FW_LOADER help This is a driver for the Multi-Tech cards which provide several @@ -312,7 +312,7 @@ config SPECIALIX_RTSCTS config SX tristate "Specialix SX (and SI) card support" - depends on SERIAL_NONSTANDARD + depends on SERIAL_NONSTANDARD && PCI help This is a driver for the SX and SI multiport serial cards. Please read the file for details. 
[PATCH -mm] MTD: ESB2ROM uses PCI
From: Randy Dunlap <[EMAIL PROTECTED]> ESB2ROM uses PCI interface functions. With CONFIG_PCI=n: drivers/mtd/maps/esb2rom.c: In function 'esb2rom_init_one': drivers/mtd/maps/esb2rom.c:167: warning: implicit declaration of function 'pci_dev_get' Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> --- drivers/mtd/maps/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-2.6.19-rc6-mm2.orig/drivers/mtd/maps/Kconfig +++ linux-2.6.19-rc6-mm2/drivers/mtd/maps/Kconfig @@ -186,7 +186,7 @@ config MTD_ICHXROM config MTD_ESB2ROM tristate "BIOS flash chip on Intel ESB Controller Hub 2" -depends on X86 && MTD_JEDECPROBE +depends on X86 && MTD_JEDECPROBE && PCI help Support for treating the BIOS flash chip on ESB2 motherboards as an MTD device - with this you can reprogram your BIOS. --- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [v4l-dvb-maintainer] [2.6 patch] remove DVB_AV7110_FIRMWARE
On Tue, Nov 28, 2006 at 08:45:56PM -0800, Trent Piepho wrote: > On Wed, 29 Nov 2006, Adrian Bunk wrote: > > On Tue, Nov 28, 2006 at 01:06:02PM -0800, Trent Piepho wrote: > > > On Sun, 26 Nov 2006, Adrian Bunk wrote: > > > > DVB_AV7110_FIRMWARE was (except for some OSS drivers) the only option > > > > that was still compiling a binary-only user-supplied firmware file at > > > > build-time into the kernel. > > > > > > > > This patch changes the driver to always use the standard > > > > request_firmware() way for firmware by removing DVB_AV7110_FIRMWARE. > > > > > > Doesn't this also prevent the AV7110 module from getting compiled > > > into the kernel? Shouldn't the Kconfig file be adjusted so > > > that 'y' can't be selected anymore and it depends on MODULES? > > > > No. > > No. > > > > request_firmware() works fine for built-in drivers. > > Wouldn't that require loading the firmware file before the filesystems are > mounted? Sure. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
From: Herbert Xu <[EMAIL PROTECTED]> Date: Wed, 29 Nov 2006 15:56:57 +1100 > David Miller <[EMAIL PROTECTED]> wrote: > > > > Longer term this is really messy, we should handle this some > > other way. > > Definitely. I'm not sure whether 48 is enough even for recursive > tunnels. This should really just be a hint. It's OK to spend a > bit of time reallocating skb's if it's too small, but it's not OK > to die. The recursive tunnel case is handled by the PMTU reductions in the route, isn't it? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 2006-11-29 at 15:30 +1100, Keith Owens wrote: > David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote: > >From: Keith Owens > >Date: Wed, 29 Nov 2006 14:56:20 +1100 > > > >> Secondly, I believe that this is a separate problem from bug 22278. > >> hpet_readl() is correctly using volatile internally, but its result is > >> being assigned to a pair of normal integers (not declared as volatile). > >> In the context of wait_hpet_tick, all the variables are unqualified so > >> gcc is allowed to optimize the comparison away. > >> > >> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, > >> where the return value from hpet_readl() is assigned to a normal > >> variable. Nothing in the C standard says that those unqualified > >> variables should be magically treated as volatile, just because the > >> original code that extracted the value used volatile. IOW, time_hpet.c > >> needs to declare any variables that hold the result of hpet_readl() as > >> being volatile variables. > > > >I disagree with this. > > > >readl() returns values from an opaque source, and it is declared > >as such to show this to GCC. It's like a function that GCC > >cannot see the implementation of, which it cannot determine > >anything about wrt. return values. > > > >The volatile'ness does not simply disappear the moment you > >assign the result to some local variable which is not volatile. > > > >Half of our drivers would break if this were true. > > This is definitely a gcc bug, 4.1.0 is doing something weird. Compile > with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears, > CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem. > > Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches > below and the problem disappears. 
> My theory: gcc is inlining readl into hpet_readl (readl is an inline function, so it should be doing this no matter what), and inlining hpet_readl into wait_hpet_tick (otherwise, it can't possibly make any assumptions about the return values of hpet_readl -- this looks to be a SUSE-specific over-aggressive optimization), and somewhere along the way the volatile qualifier is getting lost. -- Nicholas Miell <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
David Miller <[EMAIL PROTECTED]> wrote: > > Longer term this is really messy, we should handle this some > other way. Definitely. I'm not sure whether 48 is enough even for recursive tunnels. This should really just be a hint. It's OK to spend a bit of time reallocating skb's if it's too small, but it's not OK to die. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/4] sysctl: Simplify ipc ns specific sysctls
Quoting Eric W. Biederman ([EMAIL PROTECTED]): > This patch refactors the ipc sysctl support so that it is > simpler, more readable, and prepares for fixing the bug > with the wrong values being returned in the sys_sysctl interface. > > The function proc_do_ipc_string was misnamed as it never handled > strings. It's magic of when to work with strings and when to work > with longs belonged in the sysctl table. I couldn't tell if the > code would work if you disabled the ipc namespace but it certainly > looked like it would have problems. > > Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> Hi, A little belated (sorry), but the only comment I have right now on the patchset is that the get_ipc() seems like it shouldn't take the write arg. Perhaps if consistency is the concern, get_uts() should simply be called get_uts_locked(table, need_write) ? This also avoids the mysterious '1' argument in the next patch at get_ipc(table, 1); Oh, I lied, one more comment. It seems worth a comment at the top of get_uts() and get_ipc() explaining that table->data points to init_uts->data and that's why the 'which = which - init_uts + uts' works. thanks, -serge - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
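For anyone puzzling over the 'which = which - init_uts + uts' expression, the pointer rebasing can be shown in isolation. This is a userspace sketch with invented names (struct ns, init_ns, rebase_field()), not the actual get_uts() or get_ipc() code.

```c
struct ns {
	char sysname[16];
	char nodename[16];
};

static struct ns init_ns = { "Linux", "init-host" };

/* table_data points at a field inside init_ns. Subtracting the address
 * of init_ns gives the field's byte offset; adding the address of the
 * target namespace yields the same field in that namespace. */
static void *rebase_field(void *table_data, struct ns *target)
{
	char *which = table_data;

	which = (which - (char *)&init_ns) + (char *)target;
	return which;
}
```

The trick only works because the table entry is known to point into the initial namespace's structure, which is why a comment stating that seems worthwhile.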
Re: [v4l-dvb-maintainer] [2.6 patch] remove DVB_AV7110_FIRMWARE
On Wed, 29 Nov 2006, Adrian Bunk wrote: > On Tue, Nov 28, 2006 at 01:06:02PM -0800, Trent Piepho wrote: > > On Sun, 26 Nov 2006, Adrian Bunk wrote: > > > DVB_AV7110_FIRMWARE was (except for some OSS drivers) the only option > > > that was still compiling a binary-only user-supplied firmware file at > > > build-time into the kernel. > > > > > > This patch changes the driver to always use the standard > > > request_firmware() way for firmware by removing DVB_AV7110_FIRMWARE. > > > > Doesn't this also prevent the AV7110 module from getting compiled > > into the kernel? Shouldn't the Kconfig file be adjusted so > > that 'y' can't be selected anymore and it depends on MODULES? > > No. > No. > > request_firmware() works fine for built-in drivers. Wouldn't that require loading the firmware file before the filesystems are mounted? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
From: Herbert Xu <[EMAIL PROTECTED]> Date: Wed, 29 Nov 2006 15:38:29 +1100 > David Miller <[EMAIL PROTECTED]> wrote: > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > > index 9264139..95e86ac 100644 > > --- a/include/linux/netdevice.h > > +++ b/include/linux/netdevice.h > > @@ -94,7 +94,9 @@ #endif > > #endif > > > > #if !defined(CONFIG_NET_IPIP) && \ > > -!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE) > > +!defined(CONFIG_NET_IPGRE) && \ > > +!defined(CONFIG_IPV6_SIT) && \ > > +!defined(CONFIG_IPV6_TUNNEL) > > #define MAX_HEADER LL_MAX_HEADER > > #else > > #define MAX_HEADER (LL_MAX_HEADER + 48) > > What if ipip/gre are modules? Good catch, I'll fix that up by adding the missing CONFIG_*_MODULE cases. Longer term this is really messy, we should handle this some other way. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
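As a rough illustration of what "adding the missing CONFIG_*_MODULE cases" could look like (this is a guess at the shape, not the patch that was eventually committed), assuming that an option built as a module shows up as the option name with a _MODULE suffix. LL_MAX_HEADER is given a stand-in value and one option is pretended to be modular so the result can be checked in userspace; max_header() is a hypothetical helper.

```c
/* Stand-in values for this sketch only. */
#define LL_MAX_HEADER 32
#define CONFIG_NET_IPGRE_MODULE 1 /* pretend GRE was built as a module */

#if !defined(CONFIG_NET_IPIP) && !defined(CONFIG_NET_IPIP_MODULE) && \
    !defined(CONFIG_NET_IPGRE) && !defined(CONFIG_NET_IPGRE_MODULE) && \
    !defined(CONFIG_IPV6_SIT) && !defined(CONFIG_IPV6_SIT_MODULE) && \
    !defined(CONFIG_IPV6_TUNNEL) && !defined(CONFIG_IPV6_TUNNEL_MODULE)
#define MAX_HEADER LL_MAX_HEADER
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)
#endif

/* Hypothetical helper so the chosen value can be inspected at run time. */
static int max_header(void)
{
	return MAX_HEADER;
}
```

With only the built-in checks, the modular GRE case above would have fallen through to the smaller value, which is the hole Herbert pointed out.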
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
David Miller <[EMAIL PROTECTED]> wrote: > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index 9264139..95e86ac 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -94,7 +94,9 @@ #endif > #endif > > #if !defined(CONFIG_NET_IPIP) && \ > -!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE) > +!defined(CONFIG_NET_IPGRE) && \ > +!defined(CONFIG_IPV6_SIT) && \ > +!defined(CONFIG_IPV6_TUNNEL) > #define MAX_HEADER LL_MAX_HEADER > #else > #define MAX_HEADER (LL_MAX_HEADER + 48) What if ipip/gre are modules? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.19-rc6-mm2 is ok (2.6.19-rc1-mm1+ memory problem)
Michael Raskin wrote: I have a strange problem with 2.6.19-rc-mm kernels. After I load X, I notice that memory is marked used at rate of tens of KB/s. Then it Tried 2.6.19-rc6-mm2. Now the problem is gone. Sometimes memory is still getting marked used as before, but when the loss reaches a few MB it is all freed. After 3 hours of X + all those scripts that cause the leak + ThunderBird I can still shut down everything except a few processes and have only 50MB used. The script that demonstrated the leak now works without problems and without eating memory. Thanks.
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote: >From: Keith Owens >Date: Wed, 29 Nov 2006 14:56:20 +1100 > >> Secondly, I believe that this is a separate problem from bug 22278. >> hpet_readl() is correctly using volatile internally, but its result is >> being assigned to a pair of normal integers (not declared as volatile). >> In the context of wait_hpet_tick, all the variables are unqualified so >> gcc is allowed to optimize the comparison away. >> >> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, >> where the return value from hpet_readl() is assigned to a normal >> variable. Nothing in the C standard says that those unqualified >> variables should be magically treated as volatile, just because the >> original code that extracted the value used volatile. IOW, time_hpet.c >> needs to declare any variables that hold the result of hpet_readl() as >> being volatile variables. > >I disagree with this. > >readl() returns values from an opaque source, and it is declared >as such to show this to GCC. It's like a function that GCC >cannot see the implementation of, which it cannot determine >anything about wrt. return values. > >The volatile'ness does not simply disappear the moment you >assign the result to some local variable which is not volatile. > >Half of our drivers would break if this were true. This is definitely a gcc bug, 4.1.0 is doing something weird. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears, CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches below and the problem disappears. 
Index: linux/arch/i386/kernel/time_hpet.c === --- linux.orig/arch/i386/kernel/time_hpet.c 2006-11-29 13:51:33.900462088 +1100 +++ linux/arch/i386/kernel/time_hpet.c 2006-11-29 15:25:47.853245938 +1100 @@ -35,7 +35,8 @@ static void __iomem * hpet_virt_address; int hpet_readl(unsigned long a) { - return readl(hpet_virt_address + a); + volatile int v = readl(hpet_virt_address + a); + return v; } static void hpet_writel(unsigned long d, unsigned long a) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Wed, 29 Nov 2006 03:28:25 +0100 > [NETFILTER]: ipt_REJECT: fix memory corruption > > On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder() > reallocates the skb, leading to memory corruption when using the stale > tcph pointer to update the checksum. > > Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> Applied, thanks Patrick. And based upon your discovery wrt. MAX_HEADER I'm also applying the following. commit 93e3a20d6c67a09b867431e7d5b3e7bc97154fab Author: David S. Miller <[EMAIL PROTECTED]> Date: Tue Nov 28 20:24:10 2006 -0800 [NET]: Fix MAX_HEADER setting. MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and this is controlled by a set of CONFIG_* ifdef tests. It is trying to use LL_MAX_HEADER + 48 when any of the tunnels are enabled which set hard_header_len like this: dev->hard_header_len = LL_MAX_HEADER + sizeof(struct xxx); The correct set of tunnel drivers which do this are: ipip ip_gre ip6_tunnel sit so make the ifdef test match. Noticed by Patrick McHardy. Signed-off-by: David S. Miller <[EMAIL PROTECTED]> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 9264139..95e86ac 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -94,7 +94,9 @@ #endif #endif #if !defined(CONFIG_NET_IPIP) && \ -!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE) +!defined(CONFIG_NET_IPGRE) && \ +!defined(CONFIG_IPV6_SIT) && \ +!defined(CONFIG_IPV6_TUNNEL) #define MAX_HEADER LL_MAX_HEADER #else #define MAX_HEADER (LL_MAX_HEADER + 48) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 10/12] ext4 balloc: say rb_entry not list_entry
-- Subject: ext2 balloc: say rb_entry not list_entry From: Hugh Dickins <[EMAIL PROTECTED]> The reservations tree is an rb_tree not a list, so it's less confusing to use rb_entry() than list_entry() - though they're both just container_of(). -- Sync up this fix in ext4 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- --- linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff -puN fs/ext4/balloc.c~ext4-balloc-say-rb_entry-not-list_entry fs/ext4/balloc.c --- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-say-rb_entry-not-list_entry 2006-11-28 19:37:08.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c 2006-11-28 19:37:08.0 -0800 @@ -165,7 +165,7 @@ restart: printk("Block Allocation Reservation Windows Map (%s):\n", fn); while (n) { - rsv = list_entry(n, struct ext4_reserve_window_node, rsv_node); + rsv = rb_entry(n, struct ext4_reserve_window_node, rsv_node); if (verbose) printk("reservation window 0x%p " "start: %llu, end: %llu\n", @@ -966,7 +966,7 @@ static int find_next_reservable_window( prev = rsv; next = rb_next(&rsv->rsv_node); - rsv = list_entry(next,struct ext4_reserve_window_node,rsv_node); + rsv = rb_entry(next,struct ext4_reserve_window_node,rsv_node); /* * Reached the last reservation, we can just append to the @@ -1210,7 +1210,7 @@ static void try_to_extend_reservation(st if (!next) my_rsv->rsv_end += size; else { - next_rsv = list_entry(next, struct ext4_reserve_window_node, rsv_node); + next_rsv = rb_entry(next, struct ext4_reserve_window_node, rsv_node); if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size) my_rsv->rsv_end += size; _
[PATCH 9/12] ext4 balloc: fix off-by-one against rsv_end
-- Subject: ext2 balloc: fix off-by-one against rsv_end From: Hugh Dickins <[EMAIL PROTECTED]> rsv_end is the last block within the reservation, so alloc_new_reservation should accept start_block == rsv_end as success. -- Sync up an ext2 reservation fix in ext4 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -puN fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-rsv_end fs/ext4/balloc.c --- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-rsv_end 2006-11-28 19:37:15.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c 2006-11-28 19:37:15.0 -0800 @@ -1165,7 +1165,7 @@ retry: * check if the first free block is within the * free space we just reserved */ - if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end) + if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end) return 0; /* success */ /* * if the first free bit we found is out of the reservable space _
[PATCH 2/12] ext3 balloc: fix off-by-one against grp_goal
-- Subject: ext2 balloc: fix off-by-one against grp_goal From: Hugh Dickins <[EMAIL PROTECTED]> grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv should treat it as such. -- Sync up with ext2 reservation fix in ext3 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- --- linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -puN fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-grp_goal fs/ext3/balloc.c --- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-grp_goal 2006-11-28 19:36:48.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c 2006-11-28 19:36:48.0 -0800 @@ -1271,7 +1271,7 @@ ext3_try_to_allocate_with_rsv(struct sup } /* * grp_goal is a group relative block number (if there is a goal) - * 0 < grp_goal < EXT3_BLOCKS_PER_GROUP(sb) + * 0 <= grp_goal < EXT3_BLOCKS_PER_GROUP(sb) * first block is a filesystem wide block number * first block is the block number of the first block in this group */ @@ -1307,7 +1307,7 @@ ext3_try_to_allocate_with_rsv(struct sup if (!goal_in_my_reservation(&my_rsv->rsv_window, grp_goal, group, sb)) grp_goal = -1; - } else if (grp_goal > 0) { + } else if (grp_goal >= 0) { int curr = my_rsv->rsv_end - (grp_goal + group_first_block) + 1; _
[PATCH 12/12] ext4 balloc: fix _with_rsv freeze
-- Subject: ext2 balloc: fix _with_rsv freeze From: Hugh Dickins <[EMAIL PROTECTED]> After several days of testing ext2 with reservations, it got caught inside ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to find the free block guaranteed to be included (unless there's contention). Fix the range to find_next_usable_block's memscan: the scan from "here" (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes not 2 (the relevant bytes of bitmap in this case being f7 df ff - none 00, but the premature cutoff implying that the last was found 00). Is this a problem for mainline ext2? No, because the "size" in its memscan is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a multiple of 8. Is this a problem for ext3 or ext4? No, because they have an additional extN_test_allocatable test which rescues them from the error. -- Sync up a reservation fix from ext2 in ext4 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -puN fs/ext4/balloc.c~ext4-balloc-fix-_with_rsv-freeze fs/ext4/balloc.c --- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-_with_rsv-freeze 2006-11-28 19:37:12.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c 2006-11-28 19:37:12.0 -0800 @@ -747,7 +747,7 @@ find_next_usable_block(ext4_grpblk_t sta here = 0; p = ((char *)bh->b_data) + (here >> 3); - r = memscan(p, 0, (maxblocks - here + 7) >> 3); + r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3)); next = (r - ((char *)bh->b_data)) << 3; if (next < maxblocks && next >= start && ext4_test_allocatable(next, bh)) _
[PATCH 5/12] ext3 balloc: use io_error label
-- Subject: ext2 balloc: use io_error label From: Hugh Dickins <[EMAIL PROTECTED]> ext2_new_blocks has a nice io_error label for setting -EIO, so goto that in the one place that doesn't already use it. -- Fix it in ext3_new_blocks. Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff -puN fs/ext3/balloc.c~ext3-balloc-use-io_error-label fs/ext3/balloc.c --- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-use-io_error-label 2006-11-28 19:45:51.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c 2006-11-28 19:45:51.0 -0800 @@ -1515,10 +1515,8 @@ retry_alloc: if (group_no >= ngroups) group_no = 0; gdp = ext3_get_group_desc(sb, group_no, &gdp_bh); - if (!gdp) { - *errp = -EIO; - goto out; - } + if (!gdp) + goto io_error; free_blocks = le16_to_cpu(gdp->bg_free_blocks_count); /* * skip this group if the number of _
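The single-label error path this patch converges on can be sketched in plain C as below. This is a simplified illustration, not the ext3 code: every name ending in _sketch is hypothetical, and the group descriptor table is reduced to an array of free-block counts.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup: returns NULL when the group number is out of range. */
static const int *get_group_desc_sketch(const int *table, int ngroups,
                                        int group_no)
{
    if (group_no < 0 || group_no >= ngroups)
        return NULL;
    return &table[group_no];
}

/* Returns the free-block count of a group, or 0 with *errp = -EIO. */
int free_blocks_sketch(const int *table, int ngroups, int group_no, int *errp)
{
    const int *gdp;
    int free_blocks = 0;

    *errp = 0;
    gdp = get_group_desc_sketch(table, ngroups, group_no);
    if (!gdp)
        goto io_error;      /* no open-coded "*errp = -EIO; goto out;" here */
    free_blocks = *gdp;
    goto out;

io_error:
    *errp = -EIO;           /* the only place the error code is assigned */
out:
    return free_blocks;
}
```

Routing every failure through one label keeps the error code in a single place, which is the tidy-up the description asks for.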
[PATCH 6/12] ext3 balloc: fix _with_rsv freeze
Sync up a reservation fix from ext2 in ext3 -- Subject: ext2 balloc: fix _with_rsv freeze From: Hugh Dickins <[EMAIL PROTECTED]> After several days of testing ext2 with reservations, it got caught inside ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to find the free block guaranteed to be included (unless there's contention). Fix the range to find_next_usable_block's memscan: the scan from "here" (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes not 2 (the relevant bytes of bitmap in this case being f7 df ff - none 00, but the premature cutoff implying that the last was found 00). Is this a problem for mainline ext2? No, because the "size" in its memscan is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a multiple of 8. Is this a problem for ext3 or ext4? No, because they have an additional extN_test_allocatable test which rescues them from the error. -- Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -puN fs/ext3/balloc.c~ext3-balloc-fix-_with_rsv-freeze fs/ext3/balloc.c --- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-_with_rsv-freeze 2006-11-28 19:36:55.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c 2006-11-28 19:36:55.0 -0800 @@ -730,7 +730,7 @@ find_next_usable_block(ext3_grpblk_t sta here = 0; p = ((char *)bh->b_data) + (here >> 3); - r = memscan(p, 0, (maxblocks - here + 7) >> 3); + r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3)); next = (r - ((char *)bh->b_data)) << 3; if (next < maxblocks && next >= start && ext3_test_allocatable(next, bh)) _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
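The byte counts quoted in the description can be checked with a little arithmetic. The sketch below evaluates the old and new memscan length expressions from the diff; scan_len_old and scan_len_new are hypothetical helper names introduced only for this illustration.

```c
/* Old length: rounds up the bit distance, ignoring where "here" sits in its byte. */
unsigned int scan_len_old(unsigned int here, unsigned int maxblocks)
{
    return (maxblocks - here + 7) >> 3;
}

/* New length: index one past the last byte holding a bit below maxblocks,
 * minus the index of the byte that holds bit "here". */
unsigned int scan_len_new(unsigned int here, unsigned int maxblocks)
{
    return ((maxblocks + 7) >> 3) - (here >> 3);
}
```

For here = 0xcfe and maxblocks = 0xd0e, the bits of interest lie in bytes 415, 416 and 417 of the bitmap. The old expression yields (16 + 7) >> 3 = 2, while the new one yields 418 - 415 = 3, which agrees with the "3 bytes not 2" in the description.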
[PATCH 8/12] ext4 balloc: fix off-by-one against grp_goal
Subject: ext2 balloc: fix off-by-one against grp_goal From: Hugh Dickins <[EMAIL PROTECTED]> grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv should treat it as such. -- Sync up with ext2 reservation fix in ext4 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- --- linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -puN fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-grp_goal fs/ext4/balloc.c --- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-grp_goal 2006-11-28 19:37:05.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c 2006-11-28 19:37:05.0 -0800 @@ -1288,7 +1288,7 @@ ext4_try_to_allocate_with_rsv(struct sup } /* * grp_goal is a group relative block number (if there is a goal) - * 0 < grp_goal < EXT4_BLOCKS_PER_GROUP(sb) + * 0 <= grp_goal < EXT4_BLOCKS_PER_GROUP(sb) * first block is a filesystem wide block number * first block is the block number of the first block in this group */ @@ -1324,7 +1324,7 @@ ext4_try_to_allocate_with_rsv(struct sup if (!goal_in_my_reservation(&my_rsv->rsv_window, grp_goal, group, sb)) grp_goal = -1; - } else if (grp_goal > 0) { + } else if (grp_goal >= 0) { int curr = my_rsv->rsv_end - (grp_goal + group_first_block) + 1; _
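The difference between the two comparisons is easiest to see at the boundary. A minimal sketch, assuming only what the description states (that -1 means "no goal" and that block 0 of a group is a valid goal); has_goal_old and has_goal_new are hypothetical names, not kernel functions.

```c
/* Old test: wrongly treats group-relative block 0 as "no goal". */
int has_goal_old(int grp_goal)
{
    return grp_goal > 0;
}

/* New test: only negative values (the -1 sentinel) mean "no goal". */
int has_goal_new(int grp_goal)
{
    return grp_goal >= 0;
}
```

The two tests agree for every value except 0, the one case the patch is about.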
[PATCH 7/12] ext4 balloc: reset windowsz when full
-- Subject: ext2 balloc: reset windowsz when full From: Hugh Dickins <[EMAIL PROTECTED]> ext2_new_blocks should reset the reservation window size to 0 when squeezing the last blocks out of an almost full filesystem, so the retry doesn't skip any groups with less than half that free, reporting ENOSPC too soon. -- Sync up reservation fix from ext2 in ext4 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- --- linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |1 + 1 file changed, 1 insertion(+) diff -puN fs/ext4/balloc.c~ext4_reset_windowsz_in_full_fs fs/ext4/balloc.c --- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4_reset_windowsz_in_full_fs 2006-11-28 19:37:01.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c 2006-11-28 19:37:01.0 -0800 @@ -1566,6 +1566,7 @@ retry_alloc: */ if (my_rsv) { my_rsv = NULL; + windowsz = 0; group_no = goal_group; goto retry_alloc; } _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/12] ext4 balloc: use io_error label
-- Subject: ext2 balloc: use io_error label From: Hugh Dickins <[EMAIL PROTECTED]> ext2_new_blocks has a nice io_error label for setting -EIO, so goto that in the one place that doesn't already use it. -- Fix it in ext4_new_blocks. Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff -puN fs/ext4/balloc.c~ext4-balloc-use-io_error-label fs/ext4/balloc.c --- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-use-io_error-label 2006-11-28 19:42:45.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c 2006-11-28 19:43:21.0 -0800 @@ -1529,10 +1529,8 @@ retry_alloc: if (group_no >= ngroups) group_no = 0; gdp = ext4_get_group_desc(sb, group_no, &gdp_bh); - if (!gdp) { - *errp = -EIO; - goto out; - } + if (!gdp) + goto io_error; free_blocks = le16_to_cpu(gdp->bg_free_blocks_count); /* * skip this group if the number of _
[PATCH 4/12] ext3 balloc: say rb_entry not list_entry
-- Subject: ext2 balloc: say rb_entry not list_entry From: Hugh Dickins <[EMAIL PROTECTED]> The reservations tree is an rb_tree not a list, so it's less confusing to use rb_entry() than list_entry() - though they're both just container_of(). -- Sync up this fix in ext3 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- --- linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff -puN fs/ext3/balloc.c~ext3-balloc-say-rb_entry-not-list_entry fs/ext3/balloc.c --- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-say-rb_entry-not-list_entry 2006-11-28 19:36:52.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c 2006-11-28 19:36:52.0 -0800 @@ -144,7 +144,7 @@ restart: printk("Block Allocation Reservation Windows Map (%s):\n", fn); while (n) { - rsv = list_entry(n, struct ext3_reserve_window_node, rsv_node); + rsv = rb_entry(n, struct ext3_reserve_window_node, rsv_node); if (verbose) printk("reservation window 0x%p " "start: %lu, end: %lu\n", @@ -949,7 +949,7 @@ static int find_next_reservable_window( prev = rsv; next = rb_next(&rsv->rsv_node); - rsv = list_entry(next,struct ext3_reserve_window_node,rsv_node); + rsv = rb_entry(next,struct ext3_reserve_window_node,rsv_node); /* * Reached the last reservation, we can just append to the @@ -1193,7 +1193,7 @@ static void try_to_extend_reservation(st if (!next) my_rsv->rsv_end += size; else { - next_rsv = list_entry(next, struct ext3_reserve_window_node, rsv_node); + next_rsv = rb_entry(next, struct ext3_reserve_window_node, rsv_node); if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size) my_rsv->rsv_end += size; _
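The remark that rb_entry() and list_entry() are "both just container_of()" can be illustrated in user space as below. This is a sketch under stated assumptions: container_of_sketch is a simplified stand-in for the kernel macro (without its type checking), and the two structs are hypothetical, reduced versions of an rb-tree node and a reservation window node.

```c
#include <stddef.h>

/* Simplified stand-in for container_of(): step back from a member pointer
 * to the start of the structure that embeds it. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, reduced tree node. */
struct rb_node_sketch {
    struct rb_node_sketch *left, *right;
};

/* Hypothetical reservation node with the tree node embedded in it. */
struct rsv_node_sketch {
    unsigned long start, end;
    struct rb_node_sketch rsv_node;
};

/* Recover the enclosing reservation from a pointer to its tree node. */
struct rsv_node_sketch *node_to_rsv(struct rb_node_sketch *n)
{
    return container_of_sketch(n, struct rsv_node_sketch, rsv_node);
}
```

Because both entry macros reduce to the same pointer arithmetic, the patch changes only how the code reads, not what it does.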
[PATCH 3/12] ext3 balloc: fix off-by-one against rsv_end
-- Subject: ext2 balloc: fix off-by-one against rsv_end From: Hugh Dickins <[EMAIL PROTECTED]> rsv_end is the last block within the reservation, so alloc_new_reservation should accept start_block == rsv_end as success. -- Sync up an ext2 reservation fix in ext3 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -puN fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-rsv_end fs/ext3/balloc.c --- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-rsv_end 2006-11-28 19:36:58.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c 2006-11-28 19:36:58.0 -0800 @@ -1148,7 +1148,7 @@ retry: * check if the first free block is within the * free space we just reserved */ - if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end) + if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end) return 0; /* success */ /* * if the first free bit we found is out of the reservable space _
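Since rsv_end names the last block inside the reservation, the window is a closed interval and the membership test needs <= at the upper end. A minimal sketch, using a hypothetical struct rsv_window_sketch rather than the kernel's reservation types:

```c
/* Hypothetical reservation window; rsv_end is the last block, inclusive. */
struct rsv_window_sketch {
    unsigned long rsv_start;
    unsigned long rsv_end;
};

/* Returns 1 when block lies inside [rsv_start, rsv_end], else 0. */
int block_in_window(const struct rsv_window_sketch *w, unsigned long block)
{
    return block >= w->rsv_start && block <= w->rsv_end;
}
```

With the strict < comparison, a free block sitting exactly at rsv_end would be rejected even though it belongs to the window.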
[PATCH 1/12] ext3 balloc: reset windowsz when full
Port a series of ext2 balloc patches from Hugh to ext3/4. The first 6 patches are against ext3, and the rest are against ext4. -- Subject: ext2 balloc: reset windowsz when full From: Hugh Dickins <[EMAIL PROTECTED]> ext2_new_blocks should reset the reservation window size to 0 when squeezing the last blocks out of an almost full filesystem, so the retry doesn't skip any groups with less than half that free, reporting ENOSPC too soon. -- Sync up reservation fix from ext2 Signed-off-by: Mingming Cao <[EMAIL PROTECTED]> --- --- linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |1 + 1 file changed, 1 insertion(+) diff -puN fs/ext3/balloc.c~ext3_reset_windowsz_in_full_fs fs/ext3/balloc.c --- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3_reset_windowsz_in_full_fs 2006-11-28 19:36:41.0 -0800 +++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c 2006-11-28 19:36:41.0 -0800 @@ -1552,6 +1552,7 @@ retry_alloc: */ if (my_rsv) { my_rsv = NULL; + windowsz = 0; group_no = goal_group; goto retry_alloc; } _
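Why resetting windowsz matters on the retry can be sketched as follows. This assumes the group-skip test compares a group's free block count against half the reservation window size, as the description suggests; the exact comparison in the kernel may differ, and group_skipped is a hypothetical helper name.

```c
/* Assumed skip rule: pass over a group whose free block count is at most
 * half the reservation window size. */
int group_skipped(unsigned int free_blocks, unsigned int windowsz)
{
    return free_blocks <= windowsz / 2;
}
```

Under that assumption, a group with 3 free blocks is skipped while windowsz is 8, but is considered once windowsz is reset to 0, so the retry without a reservation can still use the last few blocks before giving up with ENOSPC.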
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
From: Keith Owens Date: Wed, 29 Nov 2006 14:56:20 +1100 > Secondly, I believe that this is a separate problem from bug 22278. > hpet_readl() is correctly using volatile internally, but its result is > being assigned to a pair of normal integers (not declared as volatile). > In the context of wait_hpet_tick, all the variables are unqualified so > gcc is allowed to optimize the comparison away. > > The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, > where the return value from hpet_readl() is assigned to a normal > variable. Nothing in the C standard says that those unqualified > variables should be magically treated as volatile, just because the > original code that extracted the value used volatile. IOW, time_hpet.c > needs to declare any variables that hold the result of hpet_readl() as > being volatile variables. I disagree with this. readl() returns values from an opaque source, and it is declared as such to show this to GCC. It's like a function that GCC cannot see the implementation of, which it cannot determine anything about wrt. return values. The volatile'ness does not simply disappear the moment you assign the result to some local variable which is not volatile. Half of our drivers would break if this were true. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
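The polling pattern under discussion can be sketched in user space as below. It is a simulation under stated assumptions, not the kernel code: fake_reg is a hypothetical stand-in for the memory-mapped HPET comparator, sim_readl() imitates readl() by doing its load through a volatile-qualified pointer, and the "hardware" is faked by advancing the counter after every read.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped counter register. */
static uint32_t fake_reg;

/* One volatile load per call, in the spirit of readl(). The simulated
 * hardware then advances the counter so that a later read differs. */
static uint32_t sim_readl(void)
{
    volatile uint32_t *reg = &fake_reg;
    uint32_t v = *reg;

    *reg = v + 1;
    return v;
}

/* Same shape as the loop in wait_hpet_tick(): poll until the value changes.
 * Returns how many polls were needed. */
unsigned int wait_tick_sim(void)
{
    uint32_t start, end;    /* plain locals, deliberately not volatile */
    unsigned int polls = 0;

    start = sim_readl();
    do {
        end = sim_readl();
        polls++;
    } while (start == end);

    return polls;
}
```

The locals are ordinary integers; what keeps the loop honest is that every iteration performs a fresh volatile access inside the read helper, which is the property argued for in the message above.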
Re: Linux 2.6.16.33
On Mon, Nov 27, 2006 at 11:45:30AM +, Ian Campbell wrote: > Hi Adrian, > > On Thu, 2006-11-23 at 01:05 +0100, Adrian Bunk wrote: > > ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ > > I can see the changelog and the patch but not the whole tarball. Does > that take longer to appear? PEBKAC ;-) I forgot to copy it. Thanks for your reminder, it's now there. > Cheers, > Ian. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: >On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: >> Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), >> wait_hpet_tick is optimized away to a never ending loop and the kernel >> hangs on boot in timer setup. >> >> 001a : >> 1a: 55 push %ebp >> 1b: 89 e5 mov %esp,%ebp >> 1d: eb fe jmp 1d >> >> This is not a problem with gcc 3.3.5. Adding barrier() calls to >> wait_hpet_tick does not help, making the variables volatile does. >> >> Signed-off-by: Keith Owens >> >> --- >> arch/i386/kernel/time_hpet.c |2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> Index: linux-2.6/arch/i386/kernel/time_hpet.c >> === >> --- linux-2.6.orig/arch/i386/kernel/time_hpet.c >> +++ linux-2.6/arch/i386/kernel/time_hpet.c >> @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, >> */ >> static void __devinit wait_hpet_tick(void) >> { >> -unsigned int start_cmp_val, end_cmp_val; >> +unsigned volatile int start_cmp_val, end_cmp_val; >> >> start_cmp_val = hpet_readl(HPET_T0_CMP); >> do { > >When you examine the inlined functions involved, this looks an awful lot >like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 > >Perhaps SUSE should fix their gcc instead of working around compiler >problems in the kernel? Firstly, the fix for 22278 is included in gcc 4.1.0. Secondly, I believe that this is a separate problem from bug 22278. hpet_readl() is correctly using volatile internally, but its result is being assigned to a pair of normal integers (not declared as volatile). In the context of wait_hpet_tick, all the variables are unqualified so gcc is allowed to optimize the comparison away. The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, where the return value from hpet_readl() is assigned to a normal variable. Nothing in the C standard says that those unqualified variables should be magically treated as volatile, just because the original code that extracted the value used volatile. IOW, time_hpet.c needs to declare any variables that hold the result of hpet_readl() as being volatile variables.
[PATCH] lib functions: always build hweight for loadable modules
From: Randy Dunlap <[EMAIL PROTECTED]> Always build hweight8/16/32/64() functions into the kernel so that loadable modules may use them. I didn't remove GENERIC_HWEIGHT since ALPHA_EV67, ia64, and some variants of UltraSparc(64) provide their own hweight functions. Fixes config/build problems with NTFS=m and JOYSTICK_ANALOG=m. Kernel: arch/x86_64/boot/bzImage is ready (#19) Building modules, stage 2. MODPOST 94 modules WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined! WARNING: "hweight16" [drivers/input/joystick/analog.ko] undefined! WARNING: "hweight8" [drivers/input/joystick/analog.ko] undefined! make[1]: *** [__modpost] Error 1 make: *** [modules] Error 2 Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> --- lib/Makefile |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-2.6.19-rc6-git10.orig/lib/Makefile +++ linux-2.6.19-rc6-git10/lib/Makefile @@ -25,7 +25,7 @@ lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o lib-$(CONFIG_SEMAPHORE_SLEEPERS) += semaphore-sleepers.o lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o -lib-$(CONFIG_GENERIC_HWEIGHT) += hweight.o +obj-$(CONFIG_GENERIC_HWEIGHT) += hweight.o obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o obj-$(CONFIG_PLIST) += plist.o obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o --- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
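For readers unfamiliar with the helpers involved: hweightN(w) returns the number of set bits (the Hamming weight) of an N-bit word. The sketch below shows one simple way to compute it for 32 bits; it uses the clear-lowest-set-bit loop for clarity and is not the implementation in lib/hweight.c. The name popcount32_sketch is hypothetical.

```c
#include <stdint.h>

/* Count set bits by clearing the lowest set bit until none remain. */
unsigned int popcount32_sketch(uint32_t w)
{
    unsigned int count = 0;

    while (w) {
        w &= w - 1;     /* drops the lowest set bit */
        count++;
    }
    return count;
}
```

As for the Makefile change itself: objects listed under lib-y are collected into an archive from which the linker pulls only the members that built-in code references, whereas objects under obj-y are always linked in. That is why a helper used only by modules can go missing when it is listed under lib-y.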
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: > Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), > wait_hpet_tick is optimized away to a never ending loop and the kernel > hangs on boot in timer setup. > > 001a : > 1a: 55 push %ebp > 1b: 89 e5 mov %esp,%ebp > 1d: eb fe jmp 1d > > This is not a problem with gcc 3.3.5. Adding barrier() calls to > wait_hpet_tick does not help, making the variables volatile does. > > Signed-off-by: Keith Owens > > --- > arch/i386/kernel/time_hpet.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: linux-2.6/arch/i386/kernel/time_hpet.c > === > --- linux-2.6.orig/arch/i386/kernel/time_hpet.c > +++ linux-2.6/arch/i386/kernel/time_hpet.c > @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, > */ > static void __devinit wait_hpet_tick(void) > { > - unsigned int start_cmp_val, end_cmp_val; > + unsigned volatile int start_cmp_val, end_cmp_val; > > start_cmp_val = hpet_readl(HPET_T0_CMP); > do { When you examine the inlined functions involved, this looks an awful lot like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 Perhaps SUSE should fix their gcc instead of working around compiler problems in the kernel? -- Nicholas Miell <[EMAIL PROTECTED]>
Re: [v4l-dvb-maintainer] [2.6 patch] remove DVB_AV7110_FIRMWARE
On Tue, Nov 28, 2006 at 01:06:02PM -0800, Trent Piepho wrote: > On Sun, 26 Nov 2006, Adrian Bunk wrote: > > DVB_AV7110_FIRMWARE was (except for some OSS drivers) the only option > > that was still compiling a binary-only user-supplied firmware file at > > build-time into the kernel. > > > > This patch changes the driver to always use the standard > > request_firmware() way for firmware by removing DVB_AV7110_FIRMWARE. > > Doesn't this also prevent the AV7110 module from getting compiled > into the kernel? Shouldn't the Kconfig file be adjusted so > that 'y' can't be selected anymore and it depends on MODULES? No. No. request_firmware() works fine for built-in drivers. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/5][time][x86_64] hpet_address cleanup
In preparation for supporting generic timekeeping, this patch cleans up x86-64's use of vxtime.hpet_address, changing it to just hpet_address as is also used in i386. This is necessary since the vxtime structure will be going away. Signed-off-by: John Stultz <[EMAIL PROTECTED]> arch/i386/kernel/acpi/boot.c | 23 ++- arch/x86_64/kernel/apic.c|3 ++- arch/x86_64/kernel/time.c| 36 +++- include/asm-x86_64/hpet.h|1 + 4 files changed, 28 insertions(+), 35 deletions(-) linux-2.6.19-rc6git11_timeofday-arch-x86-64-hpet-address-cleanup_C7.patch diff --git a/arch/i386/kernel/acpi/boot.c b/arch/i386/kernel/acpi/boot.c index d12fb97..b9e9f17 100644 --- a/arch/i386/kernel/acpi/boot.c +++ b/arch/i386/kernel/acpi/boot.c @@ -638,6 +638,7 @@ static int __init acpi_parse_sbf(unsigne } #ifdef CONFIG_HPET_TIMER +#include static int __init acpi_parse_hpet(unsigned long phys, unsigned long size) { @@ -671,32 +672,20 @@ #define HPET_RESOURCE_NAME_SIZE 9 hpet_res->end = (1 * 1024) - 1; } + hpet_address = hpet_tbl->addr.addrl; #ifdef CONFIG_X86_64 - vxtime.hpet_address = hpet_tbl->addr.addrl | - ((long)hpet_tbl->addr.addrh << 32); - + hpet_address |= ((long)hpet_tbl->addr.addrh << 32); +#endif printk(KERN_INFO PREFIX "HPET id: %#x base: %#lx\n", - hpet_tbl->id, vxtime.hpet_address); - - res_start = vxtime.hpet_address; -#else /* X86 */ - { - extern unsigned long hpet_address; + hpet_tbl->id, hpet_address); - hpet_address = hpet_tbl->addr.addrl; - printk(KERN_INFO PREFIX "HPET id: %#x base: %#lx\n", - hpet_tbl->id, hpet_address); - - res_start = hpet_address; - } -#endif /* X86 */ + res_start = hpet_address; if (hpet_res) { hpet_res->start = res_start; hpet_res->end += res_start; insert_resource(_resource, hpet_res); } - return 0; } #else diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c index 4d9d5ed..02f5961 100644 --- a/arch/x86_64/kernel/apic.c +++ b/arch/x86_64/kernel/apic.c @@ -36,6 +36,7 @@ #include #include #include #include +#include #include int apic_mapped; @@ 
-673,7 +674,7 @@ static void setup_APIC_timer(unsigned in local_irq_save(flags); /* wait for irq slice */ - if (vxtime.hpet_address && hpet_use_timer) { + if (hpet_address && hpet_use_timer) { int trigger = hpet_readl(HPET_T0_CMP); while (hpet_readl(HPET_COUNTER) >= trigger) /* do nothing */ ; diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c index e3ef544..a6820e0 100644 --- a/arch/x86_64/kernel/time.c +++ b/arch/x86_64/kernel/time.c @@ -67,6 +67,7 @@ #define US_SCALE 32 /* 2^32, arbitralril unsigned int cpu_khz; /* TSC clocks / usec, not used here */ EXPORT_SYMBOL(cpu_khz); +unsigned long hpet_address; static unsigned long hpet_period; /* fsecs / HPET clock */ unsigned long hpet_tick; /* HPET clocks / interrupt */ int hpet_use_timer;/* Use counter of hpet for time keeping, otherwise PIT */ @@ -316,7 +317,7 @@ static noinline void handle_lost_ticks(i KERN_WARNING "Your time source seems to be instable or " "some driver is hogging interupts\n"); print_symbol("rip %s\n", get_irq_regs()->rip); - if (vxtime.mode == VXTIME_TSC && vxtime.hpet_address) { + if (vxtime.mode == VXTIME_TSC && hpet_address) { printk(KERN_WARNING "Falling back to HPET\n"); if (hpet_use_timer) vxtime.last = hpet_readl(HPET_T0_CMP) - @@ -324,6 +325,7 @@ static noinline void handle_lost_ticks(i else vxtime.last = hpet_readl(HPET_COUNTER); vxtime.mode = VXTIME_HPET; + vxtime.hpet_address = hpet_address; do_gettimeoffset = do_gettimeoffset_hpet; } /* else should fall back to PIT, but code missing. */ @@ -354,7 +356,7 @@ void main_timer_handler(void) write_seqlock(_lock); - if (vxtime.hpet_address) + if (hpet_address) offset = hpet_readl(HPET_COUNTER); if (hpet_use_timer) { @@ -717,7 +719,7 @@ static __init int late_hpet_init(void) struct hpet_datahd; unsigned intntimer; - if (!vxtime.hpet_address) + if (!hpet_address) return 0; memset(, 0,
[PATCH 5/5][time][x86_64] Re-enable vsyscall support for x86_64
Cleanup and re-enable vsyscall gettimeofday using the generic clocksource infrastructure. Signed-off-by: John Stultz <[EMAIL PROTECTED]> arch/x86_64/Kconfig |4 + arch/x86_64/kernel/hpet.c|6 + arch/x86_64/kernel/time.c|6 - arch/x86_64/kernel/tsc.c |7 ++ arch/x86_64/kernel/vmlinux.lds.S | 28 +++-- arch/x86_64/kernel/vsyscall.c| 121 +++ include/asm-x86_64/proto.h |3 include/asm-x86_64/timex.h |1 include/asm-x86_64/vsyscall.h| 32 +- 9 files changed, 105 insertions(+), 103 deletions(-) linux-2.6.19-rc6git11_timeofday-arch-x86-64-vsyscall-reenablement_C7.patch diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig index 20bcd6d..c8026f8 100644 --- a/arch/x86_64/Kconfig +++ b/arch/x86_64/Kconfig @@ -28,6 +28,10 @@ config GENERIC_TIME bool default y +config GENERIC_TIME_VSYSCALL + bool + default y + config ZONE_DMA32 bool default y diff --git a/arch/x86_64/kernel/hpet.c b/arch/x86_64/kernel/hpet.c index c00b01a..2d3aed1 100644 --- a/arch/x86_64/kernel/hpet.c +++ b/arch/x86_64/kernel/hpet.c @@ -440,6 +440,11 @@ static cycle_t read_hpet(void) return (cycle_t)readl(hpet_ptr); } +static cycle_t __vsyscall_fn vread_hpet(void) +{ + return (cycle_t)readl((void *)fix_to_virt(VSYSCALL_HPET) + 0xf0); +} + struct clocksource clocksource_hpet = { .name = "hpet", .rating = 250, @@ -448,6 +453,7 @@ struct clocksource clocksource_hpet = { .mult = 0, /* set below */ .shift = HPET_SHIFT, .is_continuous = 1, + .vread = vread_hpet, }; static int __init init_hpet_clocksource(void) diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c index 4bc737c..17bb7de 100644 --- a/arch/x86_64/kernel/time.c +++ b/arch/x86_64/kernel/time.c @@ -53,13 +53,7 @@ DEFINE_SPINLOCK(rtc_lock); EXPORT_SYMBOL(rtc_lock); DEFINE_SPINLOCK(i8253_lock); -unsigned long vxtime_hz = PIT_TICK_RATE; - -struct vxtime_data __vxtime __section_vxtime; /* for vsyscalls */ - volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES; -struct timespec __xtime __section_xtime; -struct timezone __sys_tz 
__section_sys_tz; unsigned long profile_pc(struct pt_regs *regs) { diff --git a/arch/x86_64/kernel/tsc.c b/arch/x86_64/kernel/tsc.c index 682e122..5c768cf 100644 --- a/arch/x86_64/kernel/tsc.c +++ b/arch/x86_64/kernel/tsc.c @@ -185,6 +185,12 @@ static cycle_t read_tsc(void) return ret; } +static cycle_t __vsyscall_fn vread_tsc(void) +{ + cycle_t ret = (cycle_t)get_cycles_sync(); + return ret; +} + static struct clocksource clocksource_tsc = { .name = "tsc", .rating = 300, @@ -194,6 +200,7 @@ static struct clocksource clocksource_ts .shift = 22, .update_callback= tsc_update_callback, .is_continuous = 1, + .vread = vread_tsc, }; static int tsc_update_callback(void) diff --git a/arch/x86_64/kernel/vmlinux.lds.S b/arch/x86_64/kernel/vmlinux.lds.S index d9534e7..5b10798 100644 --- a/arch/x86_64/kernel/vmlinux.lds.S +++ b/arch/x86_64/kernel/vmlinux.lds.S @@ -94,31 +94,25 @@ #define VVIRT(x) (ADDR(x) - VVIRT_OFFSET __vsyscall_0 = VSYSCALL_VIRT_ADDR; . = ALIGN(CONFIG_X86_L1_CACHE_BYTES); - .xtime_lock : AT(VLOAD(.xtime_lock)) { *(.xtime_lock) } - xtime_lock = VVIRT(.xtime_lock); - - .vxtime : AT(VLOAD(.vxtime)) { *(.vxtime) } - vxtime = VVIRT(.vxtime); + .vsyscall_fn : AT(VLOAD(.vsyscall_fn)) { *(.vsyscall_fn) } + . = ALIGN(CONFIG_X86_L1_CACHE_BYTES); + .vsyscall_gtod_data : AT(VLOAD(.vsyscall_gtod_data)) + { *(.vsyscall_gtod_data) } + vsyscall_gtod_data = VVIRT(.vsyscall_gtod_data); .vgetcpu_mode : AT(VLOAD(.vgetcpu_mode)) { *(.vgetcpu_mode) } vgetcpu_mode = VVIRT(.vgetcpu_mode); - .sys_tz : AT(VLOAD(.sys_tz)) { *(.sys_tz) } - sys_tz = VVIRT(.sys_tz); - - .sysctl_vsyscall : AT(VLOAD(.sysctl_vsyscall)) { *(.sysctl_vsyscall) } - sysctl_vsyscall = VVIRT(.sysctl_vsyscall); - - .xtime : AT(VLOAD(.xtime)) { *(.xtime) } - xtime = VVIRT(.xtime); - . 
= ALIGN(CONFIG_X86_L1_CACHE_BYTES); .jiffies : AT(VLOAD(.jiffies)) { *(.jiffies) } jiffies = VVIRT(.jiffies); - .vsyscall_1 ADDR(.vsyscall_0) + 1024: AT(VLOAD(.vsyscall_1)) { *(.vsyscall_1) } - .vsyscall_2 ADDR(.vsyscall_0) + 2048: AT(VLOAD(.vsyscall_2)) { *(.vsyscall_2) } - .vsyscall_3 ADDR(.vsyscall_0) + 3072: AT(VLOAD(.vsyscall_3)) { *(.vsyscall_3) } + .vsyscall_1 ADDR(.vsyscall_0) + 1024: AT(VLOAD(.vsyscall_1)) + { *(.vsyscall_1) } + .vsyscall_2 ADDR(.vsyscall_0) + 2048: AT(VLOAD(.vsyscall_2)) + { *(.vsyscall_2) } + .vsyscall_3 ADDR(.vsyscall_0) + 3072: AT(VLOAD(.vsyscall_3)) + { *(.vsyscall_3) }
[PATCH 1/5][time][Generic] vsyscall-gtod support for GENERIC_TIME
Provides generic infrastructure for vsyscall-gtod.

Signed-off-by: John Stultz <[EMAIL PROTECTED]>

 include/linux/clocksource.h |    8 ++++++++
 kernel/timer.c              |    1 +
 2 files changed, 9 insertions(+)

linux-2.6.19-rc6git11_timeofday-vsyscall-support_C7.patch
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index d852024..62a600d 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -46,6 +46,7 @@ typedef u64 cycle_t;
  * @shift:		cycle to nanosecond divisor (power of two)
  * @update_callback:	called when safe to alter clocksource values
  * @is_continuous:	defines if clocksource is free-running.
+ * @vread:		vsyscall based read
  * @cycle_interval:	Used internally by timekeeping core, please ignore.
  * @xtime_interval:	Used internally by timekeeping core, please ignore.
  */
@@ -59,6 +60,7 @@ struct clocksource {
 	u32 shift;
 	int (*update_callback)(void);
 	int is_continuous;
+	cycle_t (*vread)(void);
 
 	/* timekeeping specific data, ignore */
 	cycle_t cycle_last, cycle_interval;
@@ -182,4 +184,10 @@ int clocksource_register(struct clocksou
 void clocksource_reselect(void);
 struct clocksource* clocksource_get_next(void);
 
+#ifdef CONFIG_GENERIC_TIME_VSYSCALL
+extern void update_vsyscall(struct timespec *ts, struct clocksource *c);
+#else
+#define update_vsyscall(now, c) do { } while(0)
+#endif
+
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/kernel/timer.c b/kernel/timer.c
index c1c7fbc..38fd4a7 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -956,6 +956,7 @@ #endif
 		clock->xtime_nsec = 0;
 		clocksource_calculate_interval(clock, tick_nsec);
 	}
+	update_vsyscall(&xtime, clock);
 }
 
 /*
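
The update_vsyscall() stub in this patch uses a common C idiom: when the
feature is configured out, the hook becomes a "do { } while (0)" macro, so
callers need no #ifdef of their own. Below is a minimal userspace sketch of
that pattern. FEATURE_VSYSCALL, update_hook() and tick() are made-up names
for illustration only, not kernel symbols.

```c
/* Hypothetical feature switch; build with -DFEATURE_VSYSCALL for a real hook. */
#ifdef FEATURE_VSYSCALL
static int hook_calls;

static void update_hook(long now, int source)
{
	(void)now;
	(void)source;
	hook_calls++;
}
#else
/* Expands to one empty statement, so "update_hook(a, b);" stays valid. */
#define update_hook(now, source) do { } while (0)
#endif

/* The call site looks the same whether the hook exists or not. */
static long tick(long now, int adjust)
{
	if (adjust)
		update_hook(now, 0);	/* safe in an unbraced if/else */
	else
		now += 1;
	return now;
}
```

One consequence of the macro form is that its arguments are not evaluated
when the feature is off, so they should not carry side effects.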
[PATCH 4/5][time][x86_64] Convert x86_64 to use GENERIC_TIME
This patch converts x86_64 to use the GENERIC_TIME infrastructure and adds clocksource structures for both TSC and HPET (ACPI PM is shared w/ i386). Signed-off-by: John Stultz <[EMAIL PROTECTED]> arch/x86_64/Kconfig |4 arch/x86_64/kernel/apic.c|2 arch/x86_64/kernel/hpet.c| 65 - arch/x86_64/kernel/pmtimer.c | 58 arch/x86_64/kernel/smpboot.c |1 arch/x86_64/kernel/time.c| 301 --- arch/x86_64/kernel/tsc.c | 105 +-- include/asm-x86_64/proto.h |1 include/asm-x86_64/timex.h |5 9 files changed, 133 insertions(+), 409 deletions(-) linux-2.6.19-rc6git11_timeofday-arch-x86-64-generic-time-conversion_C7.patch diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig index 010d226..20bcd6d 100644 --- a/arch/x86_64/Kconfig +++ b/arch/x86_64/Kconfig @@ -24,6 +24,10 @@ config X86 bool default y +config GENERIC_TIME + bool + default y + config ZONE_DMA32 bool default y diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c index 02f5961..588ef3d 100644 --- a/arch/x86_64/kernel/apic.c +++ b/arch/x86_64/kernel/apic.c @@ -696,7 +696,7 @@ static void setup_APIC_timer(unsigned in /* Turn off PIT interrupt if we use APIC timer as main timer. Only works with the PM timer right now TBD fix it for HPET too. 
*/ - if (vxtime.mode == VXTIME_PMTMR && + if ((pmtmr_ioport != 0) && smp_processor_id() == boot_cpu_id && apic_runs_main_timer == 1 && !cpu_isset(boot_cpu_id, timer_interrupt_broadcast_ipi_mask)) { diff --git a/arch/x86_64/kernel/hpet.c b/arch/x86_64/kernel/hpet.c index a219786..c00b01a 100644 --- a/arch/x86_64/kernel/hpet.c +++ b/arch/x86_64/kernel/hpet.c @@ -19,12 +19,6 @@ unsigned long hpet_tick; /* HPET clocks int hpet_use_timer;/* Use counter of hpet for time keeping, * otherwise PIT */ -unsigned int do_gettimeoffset_hpet(void) -{ - /* cap counter read to one tick to avoid inconsistencies */ - unsigned long counter = hpet_readl(HPET_COUNTER) - vxtime.last; - return (min(counter,hpet_tick) * vxtime.quot) >> US_SCALE; -} #ifdef CONFIG_HPET static __init int late_hpet_init(void) @@ -433,3 +427,62 @@ static int __init nohpet_setup(char *s) __setup("nohpet", nohpet_setup); +#define HPET_MASK 0x +#define HPET_SHIFT 22 + +/* FSEC = 10^-15 NSEC = 10^-9 */ +#define FSEC_PER_NSEC 100 + +static void *hpet_ptr; + +static cycle_t read_hpet(void) +{ + return (cycle_t)readl(hpet_ptr); +} + +struct clocksource clocksource_hpet = { + .name = "hpet", + .rating = 250, + .read = read_hpet, + .mask = (cycle_t)HPET_MASK, + .mult = 0, /* set below */ + .shift = HPET_SHIFT, + .is_continuous = 1, +}; + +static int __init init_hpet_clocksource(void) +{ + unsigned long hpet_period; + void __iomem *hpet_base; + u64 tmp; + + if (!hpet_address) + return -ENODEV; + + /* calculate the hpet address: */ + hpet_base = + (void __iomem*)ioremap_nocache(hpet_address, HPET_MMAP_SIZE); + hpet_ptr = hpet_base + HPET_COUNTER; + + /* calculate the frequency: */ + hpet_period = readl(hpet_base + HPET_PERIOD); + + /* +* hpet period is in femto seconds per cycle +* so we need to convert this to ns/cyc units +* aproximated by mult/2^shift +* +* fsec/cyc * 1nsec/100fsec = nsec/cyc = mult/2^shift +* fsec/cyc * 1ns/100fsec * 2^shift = mult +* fsec/cyc * 2^shift * 1nsec/100fsec = mult +* (fsec/cyc << 
shift)/100 = mult +* (hpet_period << shift)/FSEC_PER_NSEC = mult +*/ + tmp = (u64)hpet_period << HPET_SHIFT; + do_div(tmp, FSEC_PER_NSEC); + clocksource_hpet.mult = (u32)tmp; + + return clocksource_register(_hpet); +} + +module_init(init_hpet_clocksource); diff --git a/arch/x86_64/kernel/pmtimer.c b/arch/x86_64/kernel/pmtimer.c index 7554458..ae8f912 100644 --- a/arch/x86_64/kernel/pmtimer.c +++ b/arch/x86_64/kernel/pmtimer.c @@ -24,15 +24,6 @@ #include #include #include -/* The I/O port the PMTMR resides at. - * The location is detected during setup_arch(), - * in arch/i386/kernel/acpi/boot.c */ -u32 pmtmr_ioport __read_mostly; - -/* value of the Power timer at last timer interrupt */ -static u32 offset_delay; -static u32 last_pmtmr_tick; - #define ACPI_PM_MASK 0xFF /* limit it to 24 bits */ static inline u32 cyc2us(u32 cycles) @@ -48,38 +39,6 @@ static inline u32 cyc2us(u32 cycles) return (cycles >> 10); } -int pmtimer_mark_offset(void) -{ - static int first_run = 1; -
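
The mult/shift derivation in the init_hpet_clocksource() comment can be
checked with a small userspace calculation. This is only a sketch under
stated assumptions: FSEC_PER_NSEC is taken as 1,000,000 (one nanosecond is
10^6 femtoseconds), the sample period of 69841279 fs/cycle corresponds to a
14.31818 MHz counter and is used purely as example input, and the helper
names period_to_mult() and cyc2ns() are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FSEC_PER_NSEC 1000000ULL	/* 1 ns = 10^6 fs */

/*
 * Compute mult so that ns ~= (cycles * mult) >> shift,
 * given the counter period in femtoseconds per cycle:
 *   mult = (period_fs << shift) / FSEC_PER_NSEC
 */
static uint32_t period_to_mult(uint32_t period_fs, unsigned int shift)
{
	uint64_t tmp;

	assert(shift < 32);
	tmp = (uint64_t)period_fs << shift;
	return (uint32_t)(tmp / FSEC_PER_NSEC);
}

/* Convert a cycle count to nanoseconds with the mult/shift pair. */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, unsigned int shift)
{
	return (cycles * mult) >> shift;
}
```

With the sample period and shift = 22, one cycle comes out as 69 ns
(truncated from about 69.84 ns), and 14318180 cycles come out within a few
nanoseconds of one second.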
[PATCH 3/5][time][x86_64] Split x86_64/kernel/time.c up
In preperation for the x86_64 generic time conversion, this patch splits out TSC and HPET related code from arch/x86_64/kernel/time.c into respective hpet.c and tsc.c files. Signed-off-by: John Stultz <[EMAIL PROTECTED]> arch/x86_64/kernel/Makefile |2 arch/x86_64/kernel/hpet.c | 435 ++ arch/x86_64/kernel/time.c | 628 arch/x86_64/kernel/tsc.c| 201 ++ include/asm-x86_64/hpet.h |6 include/asm-x86_64/timex.h | 11 6 files changed, 658 insertions(+), 625 deletions(-) linux-2.6.19-rc6git11_timeofday-arch-x86-64-split-hpet-tsc-time_C7.patch diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile index 3c7cbff..e68a87e 100644 --- a/arch/x86_64/kernel/Makefile +++ b/arch/x86_64/kernel/Makefile @@ -8,7 +8,7 @@ obj-y := process.o signal.o entry.o trap ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \ x8664_ksyms.o i387.o syscall.o vsyscall.o \ setup64.o bootflag.o e820.o reboot.o quirks.o i8237.o \ - pci-dma.o pci-nommu.o alternative.o + pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o obj-$(CONFIG_STACKTRACE) += stacktrace.o obj-$(CONFIG_X86_MCE) += mce.o therm_throt.o diff --git a/arch/x86_64/kernel/hpet.c b/arch/x86_64/kernel/hpet.c new file mode 100644 index 000..a219786 --- /dev/null +++ b/arch/x86_64/kernel/hpet.c @@ -0,0 +1,435 @@ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +int nohpet __initdata = 0; + +unsigned long hpet_address; +unsigned long hpet_period; /* fsecs / HPET clock */ +unsigned long hpet_tick; /* HPET clocks / interrupt */ + +int hpet_use_timer;/* Use counter of hpet for time keeping, +* otherwise PIT +*/ +unsigned int do_gettimeoffset_hpet(void) +{ + /* cap counter read to one tick to avoid inconsistencies */ + unsigned long counter = hpet_readl(HPET_COUNTER) - vxtime.last; + return (min(counter,hpet_tick) * vxtime.quot) >> US_SCALE; +} + +#ifdef CONFIG_HPET +static __init int late_hpet_init(void) +{ + struct hpet_datahd; + unsigned intntimer; + + if 
(!hpet_address) + return 0; + + memset(, 0, sizeof (hd)); + + ntimer = hpet_readl(HPET_ID); + ntimer = (ntimer & HPET_ID_NUMBER) >> HPET_ID_NUMBER_SHIFT; + ntimer++; + + /* +* Register with driver. +* Timer0 and Timer1 is used by platform. +*/ + hd.hd_phys_address = hpet_address; + hd.hd_address = (void __iomem *)fix_to_virt(FIX_HPET_BASE); + hd.hd_nirqs = ntimer; + hd.hd_flags = HPET_DATA_PLATFORM; + hpet_reserve_timer(, 0); +#ifdef CONFIG_HPET_EMULATE_RTC + hpet_reserve_timer(, 1); +#endif + hd.hd_irq[0] = HPET_LEGACY_8254; + hd.hd_irq[1] = HPET_LEGACY_RTC; + if (ntimer > 2) { + struct hpet *hpet; + struct hpet_timer *timer; + int i; + + hpet = (struct hpet *) fix_to_virt(FIX_HPET_BASE); + timer = >hpet_timers[2]; + for (i = 2; i < ntimer; timer++, i++) + hd.hd_irq[i] = (timer->hpet_config & + Tn_INT_ROUTE_CNF_MASK) >> + Tn_INT_ROUTE_CNF_SHIFT; + + } + + hpet_alloc(); + return 0; +} +fs_initcall(late_hpet_init); +#endif + +int hpet_timer_stop_set_go(unsigned long tick) +{ + unsigned int cfg; + +/* + * Stop the timers and reset the main counter. + */ + + cfg = hpet_readl(HPET_CFG); + cfg &= ~(HPET_CFG_ENABLE | HPET_CFG_LEGACY); + hpet_writel(cfg, HPET_CFG); + hpet_writel(0, HPET_COUNTER); + hpet_writel(0, HPET_COUNTER + 4); + +/* + * Set up timer 0, as periodic with first interrupt to happen at hpet_tick, + * and period also hpet_tick. + */ + if (hpet_use_timer) { + hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | + HPET_TN_32BIT, HPET_T0_CFG); + hpet_writel(hpet_tick, HPET_T0_CMP); /* next interrupt */ + hpet_writel(hpet_tick, HPET_T0_CMP); /* period */ + cfg |= HPET_CFG_LEGACY; + } +/* + * Go! + */ + + cfg |= HPET_CFG_ENABLE; + hpet_writel(cfg, HPET_CFG); + + return 0; +} + +int hpet_arch_init(void) +{ + unsigned int id; + + if (!hpet_address) + return -1; + set_fixmap_nocache(FIX_HPET_BASE, hpet_address); + __set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VSYSCALL_NOCACHE); + +/* + * Read the period, compute tick and quotient. 
+ */ + + id = hpet_readl(HPET_ID); + + if (!(id &
[PATCH 0/5][time][x86_64] GENERIC_TIME patchset for x86_64
Hey Andi,
	First let me apologize, I've been busy with other things and it's
been far too long since I last posted this. Anyway, I found some time to
resync my trees and wanted to send this along.

You had asked earlier about performance impact:

Vanilla TSC:
	 149 nsecs per gtod call
	 367 nsecs per CLOCK_MONOTONIC call
	 288 nsecs per CLOCK_REALTIME call
Vanilla ACPI PM:
	1272 nsecs per gtod call
	1335 nsecs per CLOCK_MONOTONIC call
	1273 nsecs per CLOCK_REALTIME call

GENERIC_TIME TSC:
	 149 nsecs per gtod call
	 304 nsecs per CLOCK_MONOTONIC call
	 275 nsecs per CLOCK_REALTIME call
GENERIC_TIME ACPI PM:
	1273 nsecs per gtod call
	1275 nsecs per CLOCK_MONOTONIC call
	1273 nsecs per CLOCK_REALTIME call

So almost no performance change.

Ingo has a few cleanups I need to merge, but otherwise I think this is
getting close to ready for inclusion into -mm for testing. Please let me
know if you have any major objections and if not I'll re-diff it against
-mm and send it to Andrew.

New in the current C7 release:
o Synched up w/ 2.6.19-rc6-git11
o Reworked the patch order to be a bit more logical
o Dropped the apic_runs_main_timer removal on Andi's request

Let me know if you have any thoughts or comments!

thanks again!
-john
Re: [PATCH 2/2 -mm] fault-injection: lightweight code-coverage maximizer
On Tue, Nov 28, 2006 at 12:14:36PM -0800, Don Mullis wrote:

> First, waiting a few seconds for the standard FC-6 daemons to wake up.
> Then, Xemacs and Firefox.  Not tested on SMP.

Is it failslab or fail_page_alloc?

> > This doesn't maximize code coverage. It makes fault-injector reject
> > any failures which have same stacktrace before.
>
> Since the volume of (repeated) dumps is greatly reduced,
> interval/probability can be set more aggressively without crippling
> interaction.  This increases the number of error recovery paths covered
> per unit of wall clock time.
>

It seems artificial. Injecting failures into the slab or page allocator
causes a vastly greater range of errors, and it should.

I feel what you really want is a new fault capability.

Fault injection is designed to be extensible. It's not only for failslab,
fail_page_alloc, and fail_make_request. If we want to inject errors into
try_something() and have our own tuning or settings, we just need to extend
the fault attribute and define our own judging function,

struct fail_try_something_attr {
	struct gorgeous_tuning tuning;
	struct fail_attr attr;
} fail_try_something = {
	.attr = FAULT_ATTR_INITIALIZER,
};

static int should_fail_try_something(void *data)
{
	if (tuning_did_clever_decision(&fail_try_something.tuning, data))
		return 0;

	return should_fail(&fail_try_something.attr);
}

Then insert it into try_something()

int try_something(void *data)
{
	if (should_fail_try_something(data))
		return 0;

	...

	return 1;
}

Common debugfs entries for fault capabilities will be complicated soon
by pushing new entries for every fault case or pattern.
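
The wrapping pattern described in this mail (a capability-specific structure
that embeds a generic fault attribute and adds its own judging step) can be
tried out in userspace. The sketch below is not the kernel's fault-injection
code: fault_attr_sketch, should_fail_sketch() and the skip_small knob are
hypothetical stand-ins, and the generic part is reduced to "fail every
interval-th call, at most times times" so the behaviour is deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified generic attribute. */
struct fault_attr_sketch {
	unsigned long interval;	/* fail on every interval-th call; 0 = never */
	long times;		/* failures left; -1 = unlimited */
	unsigned long count;	/* calls seen so far */
};

static int should_fail_sketch(struct fault_attr_sketch *attr)
{
	assert(attr != NULL);
	attr->count++;
	if (attr->interval == 0 || attr->count % attr->interval != 0)
		return 0;
	if (attr->times == 0)
		return 0;	/* failure budget used up */
	if (attr->times > 0)
		attr->times--;
	return 1;
}

/* Capability-specific state wraps the generic attribute. */
struct fail_try_something_sketch {
	int skip_small;		/* made-up tuning knob */
	struct fault_attr_sketch attr;
};

static struct fail_try_something_sketch fail_try = {
	.skip_small = 1,
	.attr = { .interval = 3, .times = 2, .count = 0 },
};

/* The capability's own judging function runs before the generic one. */
static int should_fail_try_something(size_t size)
{
	if (fail_try.skip_small && size < 16)
		return 0;
	return should_fail_sketch(&fail_try.attr);
}

/* Returns 1 on success, 0 on an injected failure. */
static int try_something(size_t size)
{
	if (should_fail_try_something(size))
		return 0;
	return 1;
}
```

With these settings, small requests never fail, and among larger requests
the third and sixth fail, after which the failure budget is exhausted.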
Re: [PATCH 0/4] atl1: Revised Attansic L1 ethernet driver
Jay Cliburn wrote:
> I've been working on this with Jay since his initial submission. Thanks
> to everyone who has provided feedback on the resubmit. We're currently
> quite short on actual testers, since the chip only seems to be on Asus
> M2V motherboards at present. Please let me and Jay know if you have one
> of these boards and would like to test and/or have encountered bugs.

I purchased an Asus P5B-E today which also has this network card, and
would be interested in testing driver changes. Please email me directly,
as I am not subscribed to the LKML yet.

Thanks.

-- 
Jonathan deBoer
email: jonathanseltecabca
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
Krzysztof Halasa wrote:
> Patrick McHardy <[EMAIL PROTECTED]> writes:
>
>
>>It might be the case that your network device has a
>>hard_header_len > LL_MAX_HEADER, which could trigger
>>a corruption.
>
>
> Hmm... GRE tunnels add 24 bytes... I just noticed the following code in
> include/linux/netdevice.h:
>
> /*
>  *	Compute the worst case header length according to the protocols
>  *	used.
>  */
> #if !defined(CONFIG_NET_IPIP) && \
>     !defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
> #define MAX_HEADER LL_MAX_HEADER
> #else
> #define MAX_HEADER (LL_MAX_HEADER + 48)
> #endif
>
> I don't use AX25, Token Ring, the old IPIP tunnels nor IPv6 here, but
> I wonder if GRE tunnel (which is basically another, more compatible
> form of IPIP) need the same treatment as IPIP.

Both ipip and gre do this:

	dev->hard_header_len = LL_MAX_HEADER + sizeof(struct iphdr);

which explains it. It is a bug in the REJECT target, but I was
wondering whether you were really seeing this. It looks like it
makes sense to add GRE to the MAX_HEADER case above though.

>>Please try this patch on top of the REJECT patch (ideally after
>>verifying that the REJECT patch is really introducing the
>>corruption).
>
>
> That was certain. The patch fixed the problem, confirmed with current
> git tree as well. Thanks for looking at it.

Thanks. Dave, please apply this patch.

[NETFILTER]: ipt_REJECT: fix memory corruption

On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder()
reallocates the skb, leading to memory corruption when using the stale
tcph pointer to update the checksum.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c
index ad0312d..264763a 100644
--- a/net/ipv4/netfilter/ipt_REJECT.c
+++ b/net/ipv4/netfilter/ipt_REJECT.c
@@ -114,6 +114,14 @@ static void send_reset(struct sk_buff *o
 	tcph->window = 0;
 	tcph->urg_ptr = 0;
 
+	/* Adjust TCP checksum */
+	tcph->check = 0;
+	tcph->check = tcp_v4_check(tcph, sizeof(struct tcphdr),
+				   nskb->nh.iph->saddr,
+				   nskb->nh.iph->daddr,
+				   csum_partial((char *)tcph,
+						sizeof(struct tcphdr), 0));
+
 	/* Set DF, id = 0 */
 	nskb->nh.iph->frag_off = htons(IP_DF);
 	nskb->nh.iph->id = 0;
@@ -129,14 +137,8 @@ #endif
 	if (ip_route_me_harder(&nskb, addr_type))
 		goto free_nskb;
 
-	/* Adjust TCP checksum */
 	nskb->ip_summed = CHECKSUM_NONE;
-	tcph->check = 0;
-	tcph->check = tcp_v4_check(tcph, sizeof(struct tcphdr),
-				   nskb->nh.iph->saddr,
-				   nskb->nh.iph->daddr,
-				   csum_partial((char *)tcph,
-						sizeof(struct tcphdr), 0));
+
 	/* Adjust IP TTL */
 	nskb->nh.iph->ttl = dst_metric(nskb->dst, RTAX_HOPLIMIT);
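
The bug class fixed here, a pointer into a buffer that is used after a call
which may reallocate that buffer, is easy to reproduce in userspace. The
sketch below is only an analogy: struct pkt, pkt_grow() and
set_flag_after_grow() are hypothetical names, realloc() stands in for the
skb reallocation, and the safe pattern shown (keep an offset and re-derive
the pointer afterwards) is an alternative to what the patch does, which is
to finish the checksum update before the reallocating call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer. */
struct pkt {
	unsigned char *data;
	size_t len;
};

/* Grow the buffer by extra bytes; p->data may move as a result. */
static int pkt_grow(struct pkt *p, size_t extra)
{
	unsigned char *n;

	assert(p != NULL);
	n = realloc(p->data, p->len + extra);
	if (!n)
		return -1;
	memset(n + p->len, 0, extra);
	p->data = n;
	p->len += extra;
	return 0;
}

/*
 * Safe pattern: carry the header's offset across the call, not a pointer.
 * A pointer computed before pkt_grow() could refer to freed memory.
 */
static int set_flag_after_grow(struct pkt *p, size_t hdr_off,
			       size_t extra, unsigned char flag)
{
	if (pkt_grow(p, extra))
		return -1;
	assert(hdr_off < p->len);
	p->data[hdr_off] = flag;	/* re-derived from the current base */
	return 0;
}
```

Whether realloc() actually moves the block is up to the allocator, which is
part of what makes the stale-pointer version of this code fail only some of
the time.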
[patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux),
wait_hpet_tick is optimized away to a never ending loop and the kernel
hangs on boot in timer setup.

001a :
  1a:   55      push   %ebp
  1b:   89 e5   mov    %esp,%ebp
  1d:   eb fe   jmp    1d

This is not a problem with gcc 3.3.5.  Adding barrier() calls to
wait_hpet_tick does not help, making the variables volatile does.

Signed-off-by: Keith Owens

---
 arch/i386/kernel/time_hpet.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/arch/i386/kernel/time_hpet.c
===================================================================
--- linux-2.6.orig/arch/i386/kernel/time_hpet.c
+++ linux-2.6/arch/i386/kernel/time_hpet.c
@@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d,
  */
 static void __devinit wait_hpet_tick(void)
 {
-	unsigned int start_cmp_val, end_cmp_val;
+	unsigned volatile int start_cmp_val, end_cmp_val;
 
 	start_cmp_val = hpet_readl(HPET_T0_CMP);
 	do {
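
A busy-wait on a hardware register only works if the compiler performs a
fresh load on every iteration. In C this is normally expressed by making the
access to the register itself volatile. The userspace sketch below shows
that shape with an ordinary variable standing in for the register; it is not
the kernel's hpet_readl(), the name wait_for_change() is invented here, and
the spin limit is an addition so that a stuck register yields an error
instead of a hang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Spin until *reg differs from start, or give up after max_spins reads.
 * Returns the number of reads performed, or -1 on timeout.
 * The pointed-to object is volatile-qualified, so each pass through the
 * loop must perform a real load instead of reusing an earlier value.
 */
static long wait_for_change(const volatile uint32_t *reg, uint32_t start,
			    long max_spins)
{
	long spins;

	assert(reg != NULL);
	for (spins = 1; spins <= max_spins; spins++) {
		if (*reg != start)
			return spins;
	}
	return -1;
}
```

In a single-threaded test the "register" never changes during the call, so
the function either returns on the first read or times out.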
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
Patrick McHardy <[EMAIL PROTECTED]> writes:

> It might be the case that your network device has a
> hard_header_len > LL_MAX_HEADER, which could trigger
> a corruption.

Hmm... GRE tunnels add 24 bytes... I just noticed the following code in
include/linux/netdevice.h:

/*
 *	Compute the worst case header length according to the protocols
 *	used.
 */

#if !defined(CONFIG_AX25) && !defined(CONFIG_AX25_MODULE) && !defined(CONFIG_TR)
#define LL_MAX_HEADER	32
#else
#if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE)
#define LL_MAX_HEADER	96
#else
#define LL_MAX_HEADER	48
#endif
#endif

#if !defined(CONFIG_NET_IPIP) && \
    !defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
#define MAX_HEADER LL_MAX_HEADER
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)
#endif

I don't use AX25, Token Ring, the old IPIP tunnels nor IPv6 here, but
I wonder if GRE tunnel (which is basically another, more compatible
form of IPIP) need the same treatment as IPIP.

I've confirmed that REJECTs over GRE tunnel caused that corruption.

> Please try this patch on top of the REJECT patch (ideally after
> verifying that the REJECT patch is really introducing the
> corruption).

That was certain. The patch fixed the problem, confirmed with current
git tree as well. Thanks for looking at it.

I'm not sure about LL_MAX_HEADER (and/or MAX_HEADER) though. Should it
be changed as well? There are many devices adding data to header space,
perhaps tacking devices doesn't count as the skb is being linearized
in dev->hard_start_xmit() or equivalent path?
-- 
Krzysztof Halasa
Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw
Benjamin Herrenschmidt wrote:
> On Tue, 2006-11-28 at 23:22 +0000, David Woodhouse wrote:
>> On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:
>>> From: Martin Devera <[EMAIL PROTECTED]>
>>>
>>> Add hotswap capability to Serverworks/BroadCom SATA controllers. The
>>> controller has SIM register and it selects which bits in SATA_ERROR
>>> register fires interrupt.
>>> The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode
>>> error (unplug) and calls into Lukasz's hotswap framework.
>>> The code got one day testing on dual core Athlon64 H8SSL Supermicro
>>> MoBo with HT-1000 SATA, SMP kernel and two CaviarRE SATA HDDs in
>>> hotswap bays.
>>>
>>> Signed-off-by: Martin Devera <[EMAIL PROTECTED]>
>>
>> What became of this?
>
> I might be to blame for not testing it... The Xserve I had on my desk
> was too noisy for most of my co-workers so I kept delaying and forgot
> about it
>
> Also the Xserve I have only has one disk, which makes hotplug testing a
> bit harder :-)

Unfortunately my box with ht1000 is already deployed. Another similar one
should arrive soon so that I'll retest it.
Just now I've VIA based mobo here - and hotswap is NOT working with it ..

Martin
Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP
On Thu, Nov 23, 2006 at 12:18:09PM +1100, David Chinner wrote:
> On Wed, Nov 22, 2006 at 01:58:11PM +0100, Jesper Juhl wrote:
> >
> > Attached are two files. The one named stack_overflows.txt.gz contains
> > one instance of each unique stack overflow + trace that I've got. The
> > other file named kernel_BUG.txt.gz contains a few BUG() messages that
> > were also in the logs.
>
> I've just checked on a 2.6.17 build on i386 how much stack we
> are using (from checkstack.pl with min size reported set to 32 bytes)
> here in XFS:
>
> So, assuming the stacks less than 32 bytes are 32 bytes, we've got
> 1380 bytes in the XFS stack there, and very few functions where it
> can be reduced further. Still, 1380 bytes is way, way short of 4KB,
> so unless there is extra stack usage that checkstack doesn't tell us
> about I'm not sure why this amount of usage is causing repeated
> stack overflows with very little stack usage on either side of it.
>
> Can someone enlighten me as to where all the rest of the stack
> is being used up here?

FYI. With some help from Keith Owens, we've determined that gcc 3.3.5
resulted in XFS stack usage of about 1.9KB through the writeback and
allocation path with another ~800 bytes of stack usage in generic code
in this path.

The big difference between the numbers I was getting from checkstack and
reality was CONFIG_CC_OPTIMIZE_FOR_SIZE=y being set on the kernels I was
stack checking. IOWs, CONFIG_CC_OPTIMIZE_FOR_SIZE=y appears to reduce XFS
stack usage by at least 20% and so probably should be used with XFS on 4k
stacks.

Keith also confirmed that gcc-4.1's aggressive inlining of static
functions substantially increases stack usage (by ~15%) through this call
chain. Given that many of the inlined static functions are not required
by the critical path (i.e. they'd previously been factored out to reduce
stack usage), gcc is effectively undoing past mods that had substantially
reduced XFS's stack usage.

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][UPDATE] i2c: Add support for virtual I2C adapters
Is there a reason why the files and config options have been renamed from i2c-virtual to i2c-virt?

On 4/7/06, Kumar Gala <[EMAIL PROTECTED]> wrote:

Any comments or acceptance of this patch?

- k

On Mar 30, 2006, at 5:05 PM, Kumar Gala wrote:

> Virtual adapters are useful to handle multiplexed I2C bus topologies, by
> presenting each multiplexed segment as an I2C adapter. Typically, either
> a mux (or switch) exists which is an I2C device on the parent bus. One
> selects a given child bus via programming the mux and then all the devices
> on that bus become present on the parent bus. The intent is to allow
> multiple devices of the same type to exist in a system which would normally
> have address conflicts.
>
> Since virtual adapters will get registered in an I2C client's detect
> function we have to expose versions of i2c_{add,del}_adapter for
> i2c_{add,del}_virt_adapter to call that don't lock.
>
> Additionally, i2c_virt_master_xfer (and i2c_virt_smbus_xfer) acquire
> the parent->bus_lock and call the parent's master_xfer directly. This
> is because on an i2c_virt_master_xfer we have to issue an i2c write on
> the parent bus to select the given virtual adapter, then do the i2c
> operation on the parent bus, followed by another i2c write on the
> parent to deselect the virtual adapter.
>
> Signed-off-by: Kumar Gala <[EMAIL PROTECTED]>
>
> ---
> commit 862cbc263e3d3e44028d7465a912847cf5366163
> tree 2c91bad8eb66cab9727f3071831a916ada41edf8
> parent 5d4fe2c1ce83c3e967ccc1ba3d580c1a5603a866
> author Kumar Gala <[EMAIL PROTECTED]> Thu, 30 Mar 2006 17:03:42 -0600
> committer Kumar Gala <[EMAIL PROTECTED]> Thu, 30 Mar 2006 17:03:42 -0600
>
>  drivers/i2c/Kconfig    |    9 ++
>  drivers/i2c/Makefile   |    1
>  drivers/i2c/i2c-core.c |   42
>  drivers/i2c/i2c-virt.c |  173 ++++
>  include/linux/i2c-id.h |    2 +
>  include/linux/i2c.h    |   20 ++
>  6 files changed, 234 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/i2c/Kconfig b/drivers/i2c/Kconfig
> index 24383af..b8a8fc1 100644
> --- a/drivers/i2c/Kconfig
> +++ b/drivers/i2c/Kconfig
> @@ -34,6 +34,15 @@ config I2C_CHARDEV
>      This support is also available as a module. If so, the module
>      will be called i2c-dev.
>
> +config I2C_VIRT
> +  tristate "I2C virtual adapter support"
> +  depends on I2C
> +  help
> +    Say Y here if you want the I2C core to support the ability to have
> +    virtual adapters. Virtual adapters are useful to handle multiplexed
> +    I2C bus topologies, by presenting each multiplexed segment as a
> +    I2C adapter.
> +
>  source drivers/i2c/algos/Kconfig
>  source drivers/i2c/busses/Kconfig
>  source drivers/i2c/chips/Kconfig
> diff --git a/drivers/i2c/Makefile b/drivers/i2c/Makefile
> index 71c5a85..4467db2 100644
> --- a/drivers/i2c/Makefile
> +++ b/drivers/i2c/Makefile
> @@ -3,6 +3,7 @@
>  #
>
>  obj-$(CONFIG_I2C)          += i2c-core.o
> +obj-$(CONFIG_I2C_VIRT)     += i2c-virt.o
>  obj-$(CONFIG_I2C_CHARDEV)  += i2c-dev.o
>  obj-y                      += busses/ chips/ algos/
>
> diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
> index 45e2cdf..64c1c9e 100644
> --- a/drivers/i2c/i2c-core.c
> +++ b/drivers/i2c/i2c-core.c
> @@ -150,22 +150,31 @@ static struct device_attribute dev_attr_
>   */
>  int i2c_add_adapter(struct i2c_adapter *adap)
>  {
> +  int res;
> +
> +  mutex_lock(&core_lists);
> +  res = i2c_add_adapter_nolock(adap);
> +  mutex_unlock(&core_lists);
> +
> +  return res;
> +}
> +
> +int i2c_add_adapter_nolock(struct i2c_adapter *adap)
> +{
>    int id, res = 0;
>    struct list_head *item;
>    struct i2c_driver *driver;
>
> -  mutex_lock(&core_lists);
> -
>    if (idr_pre_get(&i2c_adapter_idr, GFP_KERNEL) == 0) {
>      res = -ENOMEM;
> -    goto out_unlock;
> +    goto out;
>    }
>
>    res = idr_get_new(&i2c_adapter_idr, adap, &id);
>    if (res < 0) {
>      if (res == -EAGAIN)
>        res = -ENOMEM;
> -    goto out_unlock;
> +    goto out;
>    }
>
>    adap->nr = id & MAX_ID_MASK;
> @@ -203,21 +212,29 @@ int i2c_add_adapter(struct i2c_adapter *
>      driver->attach_adapter(adap);
>    }
>
> -out_unlock:
> -  mutex_unlock(&core_lists);
> +out:
>    return res;
>  }
>
> -
>  int i2c_del_adapter(struct i2c_adapter *adap)
>  {
> +  int res;
> +
> +  mutex_lock(&core_lists);
> +  res = i2c_del_adapter_nolock(adap);
> +  mutex_unlock(&core_lists);
> +
> +  return res;
> +}
> +
> +int i2c_del_adapter_nolock(struct i2c_adapter *adap)
> +{
>    struct list_head *item, *_n;
>    struct i2c_adapter *adap_from_list;
>    struct i2c_driver *driver;
>    struct i2c_client *client;
>    int res = 0;
>
> -  mutex_lock(&core_lists);
>
>    /* First make sure that this adapter was ever added */
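The locked-wrapper / _nolock split that the patch description talks about can be sketched in userspace C. This is only an illustration of the pattern, assuming a pthread mutex in place of the kernel mutex; the adapter table, MAX_ADAPTERS, and all function names below are hypothetical stand-ins, not the i2c core API.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_ADAPTERS 8  /* hypothetical table size for this sketch */

/* Stand-in for the kernel's core_lists mutex (assumption: a plain,
 * non-recursive pthread mutex behaves similarly enough for illustration). */
static pthread_mutex_t core_lists = PTHREAD_MUTEX_INITIALIZER;
static const char *adapters[MAX_ADAPTERS];

/* Core routine: the caller must already hold core_lists. */
int add_adapter_nolock(const char *name)
{
    int i;

    for (i = 0; i < MAX_ADAPTERS; i++) {
        if (adapters[i] == NULL) {
            adapters[i] = name;
            return i;   /* slot number, loosely analogous to adap->nr */
        }
    }
    return -1;          /* table full */
}

/* Public entry point: takes the lock, then defers to the core routine. */
int add_adapter(const char *name)
{
    int res;

    pthread_mutex_lock(&core_lists);
    res = add_adapter_nolock(name);
    pthread_mutex_unlock(&core_lists);
    return res;
}

/* A caller that already holds the lock registers a second adapter through
 * the _nolock variant. Calling add_adapter() at that point would try to
 * take core_lists again and deadlock on a non-recursive mutex. */
int add_parent_and_child(const char *parent, const char *child)
{
    int res;

    pthread_mutex_lock(&core_lists);
    res = add_adapter_nolock(parent);
    if (res >= 0)
        res = add_adapter_nolock(child);
    pthread_mutex_unlock(&core_lists);
    return res;
}
```

The design point is that the locking policy lives in one thin wrapper, so code paths that run with the lock already held can reuse the same core logic.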
Re: failed 'ljmp' in linear addressing mode
On Tue, Nov 28, 2006 at 06:49:17PM -0500, linux-os (Dick Johnson) wrote:
>
> On Tue, 28 Nov 2006, Jun Sun wrote:
>
> > On Tue, Nov 28, 2006 at 08:46:44AM -0500, linux-os (Dick Johnson) wrote:
> >>
> >> On Mon, 27 Nov 2006, Jun Sun wrote:
> >>
> >>>
> >>> On Mon, Nov 27, 2006 at 08:58:57AM -0500, linux-os (Dick Johnson) wrote:
> >>>>
> >>>> I think it probably resets the instant that you turn off paging. To
> >>>> turn off paging, you need to copy some code (properly linked) to an
> >>>> area where there is a 1:1 mapping between virtual and physical addresses.
> >>>> A safe place is somewhere below 1 megabyte. Then you need to set up a
> >>>> call descriptor so you can call that code (you can ljump if you never
> >>>> plan to get back). You then need to clear interrupts on all CPUs (use a
> >>>> spin-lock). Once you are executing from the new area, you reset your
> >>>> segments to the new area. The call descriptor would have already set
> >>>> CS, as would have the long-jump. At this time you can turn off paging
> >>>> and flush the TLB. You are now in linear-address protected mode.
> >>>>
> >>>
> >>> Thanks for the reply. But I am pretty sure I did the above correctly.
> >>> I use a single-instruction infinite loop in the call path to verify
> >>> that control does reach the last 'ljmp' but not the jump destination.
> >>>
> >>> Below is the hack I made to the machine_kexec.c file. As you can see, I
> >>> managed to make the identity mapping between virtual and physical
> >>> addresses.
> >>>
> >>> Note I did not copy the code into the first 1M. In fact the code
> >>> is located at 0xc0477000 (0x00477000 in physical). I thought that should
> >>> be OK as I did not really go all the way back to real-address mode.
> >>>
> >>> The last suspect I have now is the wrong value in the CS descriptor. Does
> >>> the kernel have a suitable CS descriptor for the last ljmp to 0x1000 in
> >>> linear addressing mode? The CS descriptor seems to be a pretty dark magic
> >>> to me ...
> >>>
> >>> Cheers.
> >>> > >>> Jun > >>> > >>> - > >>> diff -Nru linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig > >>> linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c > >>> --- linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig > >>> 2006-10-13 11:55:04.0 -0700 > >>> +++ linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c > >>> 2006-11-22 15:01:45.0 -0800 > >>> @@ -212,3 +212,19 @@ > >>>rnk = (relocate_new_kernel_t) reboot_code_buffer; > >>>(*rnk)(page_list, reboot_code_buffer, image->start, cpu_has_pae); > >>> } > >>> + > >>> +extern void do_os_switching(void); > >>> +void os_switch(void) > >>> +{ > >>> + void (*foo)(void); > >>> + > >>> + /* absolutely no irq */ > >>> + local_irq_disable(); > >>> + > >>> + /* create identity mapping */ > >>> + foo=virt_to_phys(do_os_switching); > >>> + identity_map_page((unsigned long)foo); > >>> + > >>> + /* jump to the real address */ > >>> + foo(); > >>> +} > >>> > >> Get a copy of the Intel 486 Microprocessor Reference Manual or read it on- > >> line. There is no way that you can make a call like that. > > > > By "a call like that", you mean "foo()"? Are you sure about that? > > > > The machine_kexec() function in the same file is basically doing the > > same way (i.e., use "call *$eax" instead of "ljmp"). That is where I got > > my idea from. > > > > In addition, if I put "1: jmp 1b" instruction anywhere *inside* > > do_os_switching() I would get infinite hanging instead of reboot, > > which seems to suggest I *did* jump into do_os_switching() successfully. > > > > According to Intel Architecture Software Developer's Manual (1997), Vol 3, > > page 8-14: > > > > "2. If paging is enabled perform the following operations: > > > > - Transfer program control to linear addresses that are identity mapped to > >physical addresses (that is, linear addresses equal physical addresses) > > ... > > " > > > > it does not indicate one has to use "ljmp" to do this control transfer. > > Assume you are accessing memory at 0xc000-. 
This address, when page translation is occurring (page 5-17), consists of three parts.
>
> (1) A 12-bit offset, bits 0:11
> (2) A 10-bit page-table index, bits 12:21
> (3) A 10-bit page-directory index, bits 22:31
>
> So 0xc00 is an index into the page directory. If you wish to turn off
> translation, you can't just turn off those bits. The next instruction
> will be fetched from memory with the page-cache upper bits reset, i.e,
> using offset 0 of the page directory. You somehow need to turn off those
> bits at the same time the next instruction is fetched. Normally you
> use a call gate. However, you can do a long jump which reloads the
> segment register. When the instruction book says "transfer control"
> it doesn't mean just jump to some offset. When the instruction address is
> 0xC000-, it is not the same as 0x-. These two addresses are
> different (to
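For reference, the three-way split of a linear address described above can be sketched in C. This is an illustration only, assuming classic non-PAE i386 paging with 4KB pages; the `linear_parts` struct and `split_linear` helper are hypothetical names, not kernel code.

```c
#include <stdint.h>

/* The three fields of a 32-bit linear address under non-PAE i386 paging
 * (assumption: 4KB pages, two-level page tables). */
struct linear_parts {
    uint32_t dir_index;    /* bits 31..22: index into the page directory */
    uint32_t table_index;  /* bits 21..12: index into the page table */
    uint32_t offset;       /* bits 11..0:  byte offset within the page */
};

/* Split a linear address into its directory index, table index and offset. */
struct linear_parts split_linear(uint32_t addr)
{
    struct linear_parts p;

    p.dir_index   = addr >> 22;
    p.table_index = (addr >> 12) & 0x3ffu;
    p.offset      = addr & 0xfffu;
    return p;
}
```

With the addresses mentioned in this thread, 0xc0477000 gives directory index 0x301 and table index 0x077, while its physical counterpart 0x00477000 gives directory index 0x001 with the same table index. An identity mapping therefore needs its own page-directory entry, separate from the one the kernel's virtual address uses.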
Re: XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c (kernel 2.6.18.1)
On Tue, Nov 28, 2006 at 04:49:00PM +0100, Jesper Juhl wrote:
> Hi,
>
> One of my NFS servers just gave me a nasty surprise that I think it is
> relevant to tell you about:

Thanks, Jesper.

> Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1138 of
> file fs/xfs/xfs_trans.c. Caller 0x8034b47e
>
> Call Trace:
> [] show_trace+0xb2/0x380
> [] dump_stack+0x15/0x20
> [] xfs_error_report+0x3c/0x50
> [] xfs_trans_cancel+0x6e/0x130
> [] xfs_create+0x5ee/0x6a0
> [] xfs_vn_mknod+0x156/0x2e0
> [] xfs_vn_create+0xb/0x10
> [] vfs_create+0x8c/0xd0
> [] nfsd_create_v3+0x31a/0x560
> [] nfsd3_proc_create+0x148/0x170
> [] nfsd_dispatch+0xf9/0x1e0
> [] svc_process+0x437/0x6e0
> [] nfsd+0x1cd/0x360
> [] child_rip+0xa/0x12
> xfs_force_shutdown(dm-1,0x8) called from line 1139 of file
> fs/xfs/xfs_trans.c. Return address = 0x80359daa

We shut down the filesystem because we cancelled a dirty transaction. Once we start to dirty the incore objects, we can't roll back to an unchanged state if a subsequent fatal error occurs during the transaction and we have to abort it.

If I understand historic occurrences of this correctly, there is a possibility that it can be triggered in ENOMEM situations. Was your machine running out of memory when this occurred?

> Filesystem "dm-1": Corruption of in-memory data detected. Shutting
> down filesystem: dm-1
> Please umount the filesystem, and rectify the problem(s)
> nfsd: non-standard errno: 5

EIO gets returned in certain locations once the filesystem has been shut down.

> I unmounted the filesystem, ran xfs_repair which told me to try and
> mount it first to replay the log, so I did, unmounted it again, ran
> xfs_repair (which didn't find any problems) and finally mounted it and
> everything is good - the filesystem seems intact.

Yeah, the above error report typically is due to an in-memory problem, not an on-disk issue.

> The server in question is running kernel 2.6.18.1

Can happen to XFS on any kernel version - got a report of this from someone running a 2.4 kernel a couple of weeks ago.

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group
Re: 2.6.18 tsc clocksource + ntp = excessive drift; acpi_pm does fine.
john stultz wrote:
> On Tue, 2006-11-28 at 21:46 -0200, Alexandre Pereira Nunes wrote:
> > Hi,
> >
> > with default boot I got tsc clocksource selected on a debian's
> > 2.6.18-3-k7 SMP build (but UP machine). ntp keeps bothering me with this
> > message:
> > frequency error 512 PPM exceeds tolerance 500 PPM
>
> Hmmm. Could you send me your dmesg? Also what frequency is your cpu?

Sure, attached! You'll notice an "acpi_pm installed" or something at the end, that was at the time I typed the echo acpi_pm >/sys/whatever. My cpu is an athlon xp 2600+, I attached a copy of /proc/cpuinfo for convenience.

> Also does booting w/ "noapic" change the behavior?

I'll test it and let you know. I also read (but didn't try) about some "notsc" option, I assume that's not a good one to try, right?

[cut]

> > If I remove ntp's drift file, then do a: echo acpi_pm
> > >/sys/devices/system/clocksource/clocksource0/available_clocksource ;
>
> I think you mean "current_clocksource" there...

Ooops. Let's just pretend no one else saw that! :-)

[cut]

> Yea, it's likely the generic timekeeping changes for i386. Previously
> (pre-2.6.18) it probably defaulted to the acpi pm timer and was fine.
> The new code is a bit more aggressive in trying to use the TSC.

Just out of curiosity: what about this acpi_pm stuff ... Reading from tsc is probably cheaper than any other "accurate" clock source, but how bad (or good) is acpi_pm?

> As a short term workaround, you can put "clocksource=acpi_pm" on your
> grub line and that will force the clocksource at boot.

Yeah, I googled around and had put that on grub's config, but didn't reboot. I'll swap that with noapic and reboot, by tomorrow I should have some news.
- Alexandre Linux version 2.6.18-3-k7 (Debian 2.6.18-6) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)) #1 SMP Thu Nov 23 21:37:22 UTC 2006 BIOS-provided physical RAM map: BIOS-e820: - 0009fc00 (usable) BIOS-e820: 0009fc00 - 000a (reserved) BIOS-e820: 000d - 000d6000 (reserved) BIOS-e820: 000f - 0010 (reserved) BIOS-e820: 0010 - 1fff (usable) BIOS-e820: 1fff - 1fff8000 (ACPI data) BIOS-e820: 1fff8000 - 2000 (ACPI NVS) BIOS-e820: fec0 - fec01000 (reserved) BIOS-e820: fee0 - fee01000 (reserved) BIOS-e820: fff8 - 0001 (reserved) 0MB HIGHMEM available. 511MB LOWMEM available. On node 0 totalpages: 131056 DMA zone: 4096 pages, LIFO batch:0 Normal zone: 126960 pages, LIFO batch:31 DMI 2.3 present. ACPI: RSDP (v000 AMI ) @ 0x000fa8a0 ACPI: RSDT (v001 AMIINT VIA_K7 0x0010 MSFT 0x0097) @ 0x1fff ACPI: FADT (v001 AMIINT VIA_K7 0x0011 MSFT 0x0097) @ 0x1fff0030 ACPI: MADT (v001 AMIINT VIA_K7 0x0009 MSFT 0x0097) @ 0x1fff00c0 ACPI: DSDT (v001VIA KT266A 0x1000 MSFT 0x010d) @ 0x ACPI: PM-Timer IO Port: 0x808 ACPI: Local APIC address 0xfee0 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 6:8 APIC version 16 ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0]) IOAPIC[0]: apic_id 2, version 3, address 0xfec0, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 1 I/O APICs Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at 3000 (gap: 2000:dec0) Detected 2133.046 MHz processor. Built 1 zonelists. Total pages: 131056 Kernel command line: root=/dev/hda2 ro mapped APIC to d000 (fee0) mapped IOAPIC to c000 (fec0) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 PID hash table entries: 2048 (order: 11, 8192 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 515356k/524224k available (1556k kernel code, 8332k reserved, 582k data, 196k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 4270.91 BogoMIPS (lpj=8541825) Security Framework v1.0.0 initialized SELinux: Disabled at boot. Capability LSM initialized Mount-cache hash table entries: 512 CPU: After generic identify, caps: 0383fbff c1c3fbff CPU: After vendor identify, caps: 0383fbff c1c3fbff CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 256K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 0420 Intel machine
Re: [rfc PATCH] ieee1394: ohci1394: delete bogus spinlock, flush MMIO writes
On Wed, 29 Nov 2006 00:50:43 +0100 Stefan Richter <[EMAIL PROTECTED]> wrote:
> Alan wrote:
> > On Tue, 28 Nov 2006 22:24:11 +0100 (CET)
> > Stefan Richter <[EMAIL PROTECTED]> wrote:
> >> All MMIO writes which were surrounded by the spinlock as well as the
> >> very last MMIO write of the IRQ handler are now explicitly flushed by
> >> MMIO reads of the respective register.
> >
> > MMIO is ordered anyway on the bus, you just need mmiowb() to force
> > ordering to the bus controller in case you are on a big numa box.
>
> The mmiowb is a checkpoint to ensure ordering between different threads
> of MMIO writes; i.e. it doesn't halt the thread until the write actually
> reached the device like a read would do, right?

It guarantees that no other mmio will sneak past it from another thread but doesn't guarantee the previous I/O has hit the hardware. It's a much weaker (and thus far faster) guarantee which is usually sufficient as it can be combined with spin_unlock to enforce I/O ordering matching the lock ordering.

Alan
Re: 2.6.19-rc6-rt8
On Mon, 27 Nov 2006 10:49:27 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:
> i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from
> the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/

Attached is a patch that makes it compile and work on my PowerBook G4.

Index: linux-2.6.19-rc6-rt5/arch/powerpc/kernel/time.c
===
--- linux-2.6.19-rc6-rt5.orig/arch/powerpc/kernel/time.c 2006-11-28 22:13:54.0 +
+++ linux-2.6.19-rc6-rt5/arch/powerpc/kernel/time.c 2006-11-28 22:15:48.0 +
@@ -507,7 +507,7 @@
     if (per_cpu(last_jiffy, cpu) >= tb_next_jiffy) {
       tb_last_jiffy = tb_next_jiffy;
       do_timer(1);
-      timer_recalc_offset(tb_last_jiffy);
+      /*timer_recalc_offset(tb_last_jiffy);*/
       timer_check_rtc();
     }
     write_sequnlock(&xtime_lock);
Index: linux-2.6.19-rc6-rt5/include/asm-powerpc/semaphore.h
===
--- linux-2.6.19-rc6-rt5.orig/include/asm-powerpc/semaphore.h 2006-11-28 22:13:54.0 +
+++ linux-2.6.19-rc6-rt5/include/asm-powerpc/semaphore.h 2006-11-28 22:15:48.0 +
@@ -10,7 +10,7 @@

 #ifdef __KERNEL__

-#include
+/*#include */
 #include
 #include
 #include
Index: linux-2.6.19-rc6-rt5/mm/page_alloc.c
===
--- linux-2.6.19-rc6-rt5.orig/mm/page_alloc.c 2006-11-28 22:13:54.0 +
+++ linux-2.6.19-rc6-rt5/mm/page_alloc.c 2006-11-28 22:15:48.0 +
@@ -2800,7 +2800,9 @@

 void __init page_alloc_init(void)
 {
+#ifdef CONFIG_HOTPLUG_CPU
   hotcpu_notifier(page_alloc_cpu_notify, 0);
+#endif
 }

 /*
Re: 2.6.19-rc6-mm2
On Tue, Nov 28, 2006 at 04:58:28PM -0800, Andrew Morton wrote:
> On Tue, 28 Nov 2006 19:24:45 -0500
> Thomas Tuttle <[EMAIL PROTECTED]> wrote:
>
> > 2. I'm not sure if this bug is in the kernel, wireless tools, or the
> > ipw3945 driver, but I haven't changed the version of anything but the
> > kernel. When I do `iwconfig eth1 essid foobar' something drops the
> > last character of the essid, and a subsequent `iwconfig eth1' shows
> > "fooba" as the essid. And it's actually set as "fooba", since I had
> > to do `iwconfig eth1 essid MyUsualEssid_' (note underscore) to get on
> > to my usual network.
>
> This could be version skew between the wireless APIs in the kernel.org kernel,
> the wireless userspace, the out-of-tree ipw3945 driver and conceivably one
> of the git trees in -mm (although I suspect not the latter).
>
> I don't know, but I know who to cc ;) Probably they will want to know which
> version of wireless-tools userspace you are running.

Yes, it's a problem because the driver is out-of-tree. I sent a patch to the maintainer to make the driver compatible with kernels before and after the change, and it's actually integrated in version 1.1.2 of the driver (Nov 1st). So, please upgrade your driver and tell us how it works...

Jean
Re: [PATCH] Don't compare unsigned variable for <0 in sys_prctl()
On 29/11/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> On Wed, 29 Nov 2006, Jesper Juhl wrote:
> >
> > I would venture that "-Wshadow" is another one of those.
>
> I'd agree, except for the fact that gcc does a horribly _bad_ job of
> -Wshadow, making it (again) totally unusable.
>
> For example, it's often entirely interesting to hear about local variables
> that shadow each other. No question about it.
>
> HOWEVER. It's _not_ really interesting to hear about a local variable that
> happens to have a common name that is also shared by an extern function.
>
> There just isn't any room for confusion, and it's actually not even that
> unusual - I tried using -Wshadow on real programs, and it was just
> horribly irritating.
>
> In the kernel, we had obvious things like local use of "jiffies" that just
> make _total_ sense in a small inline function, and the fact that there
> happens to be an extern declaration for "jiffies" just isn't very
> interesting.
>
> Similarly, with nested macro expansion, even the "local variable shadows
> another local variable" case - that looks like it should have an obvious
> warning on the face of it - really isn't always necessarily that
> interesting after all. Maybe it is a bug, maybe it isn't, but it's no
> longer _obviously_ bogus any more.
>
> So I'm not convinced about the usefulness of "-Wshadow". ESPECIALLY the
> way that gcc implements it, it's almost totally useless in real life.
>
> For example, I tried it on "git" one time, and this is a perfect example
> of why "-Wshadow" is totally broken:
>
>   diff-delta.c: In function 'create_delta_index':
>   diff-delta.c:142: warning: declaration of 'index' shadows a global declaration
>
> (and there's a _lot_ of those). If I'm not allowed to use "index" as a
> local variable and include <string.h> at the same time, something is
> simply SERIOUSLY WRONG with the warning.
>
> So the fact is, the C language has scoping rules for a reason. Can you
> screw yourself by using them badly? Sure. But that does NOT mean that the
> same name in different scopes is a bad thing that should be warned about.
>
> If I wanted a language that didn't allow me to do anything wrong, I'd be
> using Pascal. As it is, it turns out that things that "look" wrong on a
> local level are often not wrong after all.

I can't really say anything else at this point but, point conceded...

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
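To make the complaint above concrete, here is a minimal sketch of the kind of code that draws the 'index' warning. It is an illustration, not code from git: `first_position` is a hypothetical helper, and the statement that -Wshadow flags it is an assumption that applies to the gcc versions discussed in this thread.

```c
#include <stddef.h>
#include <strings.h>  /* declares index(), the legacy BSD spelling of strchr() */

/* Return the position of the first occurrence of c in s, or -1 if absent. */
long first_position(const char *s, char c)
{
    /* This local hides the global function index() inside this function
     * only. The program is well-defined C; nothing here calls index(). */
    long index;

    for (index = 0; s[index] != '\0'; index++) {
        if (s[index] == c)
            return index;
    }
    return -1;
}
```

The function behaves the same whatever the local is called, which is the point being made: the name clash is resolved by ordinary scoping and leaves no room for confusion.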
Re: [PATCH 3/16] LTTng 0.6.36 for 2.6.18 : Linux Kernel Markers
Hi -

On Tue, Nov 28, 2006 at 05:40:36AM +, Christoph Hellwig wrote:
> [...]
> > > Are you sure the license_gplok check is necessary here? We should
> > > consider encouraging non-gpl module writers to instrument their code,
> > > to give users a slightly better chance of debugging problems.
>
> > [... the authors of clearcase] have the funny habit of
> > distributing their kernel modules as ".ko" files instead of
> > sending a proper ".o" and later link it against a wrapper. The
> > result is, I must say, quite bad [...] the structure is
> > corrupted.
>
> Please don't add hacks like that for non-GPL modules.

Indeed, offline Mathieu elaborated on his problem, and it turns out that good old modversions would have solved it.

> But neither should we export any tracing functionality for them.
> They're not the kind of people we want to help at all,

Making that sort of political decision is beyond my pay grade. I merely suggested its consideration.

> and Frank just shows once again that he should rather stay away from
> kernel stuff and keep on writing C++.

Now now, if you don't like my C++, wait till you see my Smalltalk-80. Or are you just jealous that my initials subsume yours?

- FChE (this space for rent)
Re: 2.6.19-rc6-mm2
On Tue, 28 Nov 2006, Andrew Morton wrote: > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/ md-change-lifetime-rules-for-md-devices.patch gives me the following early during boot (first WARNING() inside __mutex_lock_slowpath(), then BUG at __mutex_lock_slowpath(), just after that slab corruption). When I revert md-change-lifetime-rules-for-md-devices.patch, everything seems to go fine (this machine does use neither LVM nor RAID, but the kernel has DM compiled in). Config is at http://www.jikos.cz/jikos/junk/.config_md WARNING at kernel/mutex.c:132 __mutex_lock_common() [] dump_trace+0x68/0x1b5 [] show_trace_log_lvl+0x18/0x2c [] show_trace+0xf/0x11 [] dump_stack+0x12/0x14 [] __mutex_lock_slowpath+0xa1/0x213 [] create_dir+0x24/0x1ba [] sysfs_create_dir+0x45/0x5f [] kobject_add+0xce/0x185 [] kobject_register+0x19/0x30 [] md_probe+0x11a/0x124 [] kobj_lookup+0xe6/0x122 [] get_gendisk+0xe/0x1b [] do_open+0x2e/0x298 [] blkdev_open+0x25/0x4d [] __dentry_open+0xc3/0x17e [] nameidata_to_filp+0x24/0x33 [] do_filp_open+0x32/0x39 [] do_sys_open+0x3a/0x66 [] sys_open+0x1c/0x1e [] syscall_call+0x7/0xb DWARF2 unwinder stuck at syscall_call+0x7/0xb Leftover inexact backtrace: === BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c01fc5ab *pde = Oops: [#1] SMP last sysfs file: /class/input/input5/event5/dev Modules linked in: video sony_acpi button battery backlight ac ipv6 floppy i2c_viapro i2c_core snd_via82xx gameport snd_ac97_codec snd_ac97_bus snd_seq_dummy via_rhine snd_seq_oss snd_seq_midi_event snd_seq mii snd_pcm_oss snd_mixer_oss snd_pcm pcspkr snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore serio_raw ehci_hcd ohci_hcd uhci_hcd CPU:0 EIP:0060:[]Not tainted VLI EFLAGS: 00010046 (2.6.19-rc6-mm2 #1) EIP is at __list_add+0x2a/0x5c eax: 6b6b6b6b ebx: edee9de0 ecx: eb8c34d8 edx: 6b6b6b6b esi: eb8c34b8 edi: 0246 ebp: ef60a050 esp: edee9db4 ds: 007b es: 007b ss: 0068 
Process nash (pid: 1321, ti=edee8000 task=ef60a050 task.ti=edee8000) Stack: 0001 c0197c7d edee9de0 edee9de0 edee9de0 eb8c34b8 c036e703 0002 c0197c7d c03752fd edee9de0 edee9de0 eb8c34b8 edee9de0 eb882cac ffea eb882cac edee9e30 c0197c7d ef60a5a0 ee8d3404 Call Trace: [] __mutex_lock_slowpath+0xea/0x213 [] create_dir+0x24/0x1ba [] sysfs_create_dir+0x45/0x5f [] kobject_add+0xce/0x185 [] kobject_register+0x19/0x30 [] md_probe+0x11a/0x124 [] kobj_lookup+0xe6/0x122 [] get_gendisk+0xe/0x1b [] do_open+0x2e/0x298 [] blkdev_open+0x25/0x4d [] __dentry_open+0xc3/0x17e [] nameidata_to_filp+0x24/0x33 [] do_filp_open+0x32/0x39 [] do_sys_open+0x3a/0x66 [] sys_open+0x1c/0x1e [] syscall_call+0x7/0xb DWARF2 unwinder stuck at syscall_call+0x7/0xb Leftover inexact backtrace: === no locks held by nash/1321. Code: c3 56 53 89 c3 83 ec 10 8b 41 04 39 d0 74 1c 89 4c 24 0c 89 54 24 04 89 44 24 08 c7 04 24 80 94 3a c0 e8 be f9 f1 ff 0f 0b eb fe <8b> 32 39 ce 74 1c 89 54 24 0c 89 74 24 08 89 4c 24 04 c7 04 24 EIP: [] __list_add+0x2a/0x5c SS:ESP 0068:edee9db4 <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20 in_atomic():0, irqs_disabled():1 no locks held by nash/1321. 
[] dump_trace+0x68/0x1b5 [] show_trace_log_lvl+0x18/0x2c [] show_trace+0xf/0x11 [] dump_stack+0x12/0x14 [] down_read+0x15/0x4e [] __blocking_notifier_call_chain+0x11/0x3d [] blocking_notifier_call_chain+0x17/0x1a [] do_exit+0x19/0x782 [] die+0x20c/0x231 [] do_page_fault+0x450/0x51e [] error_code+0x7c/0x84 DWARF2 unwinder stuck at error_code+0x7c/0x84 Leftover inexact backtrace: [] __list_add+0x2a/0x5c [] create_dir+0x24/0x1ba [] __mutex_lock_slowpath+0xea/0x213 [] create_dir+0x24/0x1ba [] create_dir+0x24/0x1ba [] sysfs_create_dir+0x45/0x5f [] kobject_add+0xce/0x185 [] init_waitqueue_head+0x12/0x20 [] kobject_init+0x5b/0x7d [] kobject_register+0x19/0x30 [] md_probe+0x11a/0x124 [] kobj_lookup+0xe6/0x122 [] md_probe+0x0/0x124 [] blkdev_open+0x0/0x4d [] get_gendisk+0xe/0x1b [] do_open+0x2e/0x298 [] blkdev_open+0x0/0x4d [] blkdev_open+0x0/0x4d [] blkdev_open+0x25/0x4d [] __dentry_open+0xc3/0x17e [] nameidata_to_filp+0x24/0x33 [] do_filp_open+0x32/0x39 [] get_unused_fd+0xaa/0xb4 [] _spin_unlock+0x14/0x1c [] get_unused_fd+0xaa/0xb4 [] do_sys_open+0x3a/0x66 [] sys_open+0x1c/0x1e [] syscall_call+0x7/0xb === Slab corruption: start=eb8c3428, len=488 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [](iput+0x60/0x62) 090: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Single bit error detected. Probably bad RAM. Run memtest86+ or a similar memory test tool. Prev obj: start=eb8c3234, len=488 Redzone: 0x5a2cf071/0x5a2cf071. Last user:
Re: 2.6.18 tsc clocksource + ntp = excessive drift; acpi_pm does fine.
On Tue, 2006-11-28 at 21:46 -0200, Alexandre Pereira Nunes wrote:
> Hi,
>
> with default boot I got tsc clocksource selected on a debian's
> 2.6.18-3-k7 SMP build (but UP machine). ntp keeps bothering me with this
> message:
> frequency error 512 PPM exceeds tolerance 500 PPM

Hmmm. Could you send me your dmesg? Also what frequency is your cpu?

Also does booting w/ "noapic" change the behavior?

> If I remove ntp's drift file and restart, it goes fine for a while and
> then it goes with that behaviour again.
> If I remove ntp's drift file, then do a: echo acpi_pm
> >/sys/devices/system/clocksource/clocksource0/available_clocksource ;

I think you mean "current_clocksource" there...

> and then restart ntp, it goes fine "forever".
>
> Any thoughts, something I should look at?
>
> I'll be glad to give more feedback.
>
> I don't know if that happened with 2.6.17, but I'm pretty sure that with
> 2.6.16 it was fine.

Yea, it's likely the generic timekeeping changes for i386. Previously (pre-2.6.18) it probably defaulted to the acpi pm timer and was fine. The new code is a bit more aggressive in trying to use the TSC.

As a short term workaround, you can put "clocksource=acpi_pm" on your grub line and that will force the clocksource at boot.

thanks
-john
Re: 2.6.19-rc6-mm2
On Tue, 28 Nov 2006 19:24:45 -0500 Thomas Tuttle <[EMAIL PROTECTED]> wrote:
> 2. I'm not sure if this bug is in the kernel, wireless tools, or the
> ipw3945 driver, but I haven't changed the version of anything but the
> kernel. When I do `iwconfig eth1 essid foobar' something drops the
> last character of the essid, and a subsequent `iwconfig eth1' shows
> "fooba" as the essid. And it's actually set as "fooba", since I had
> to do `iwconfig eth1 essid MyUsualEssid_' (note underscore) to get on
> to my usual network.

This could be version skew between the wireless APIs in the kernel.org kernel, the wireless userspace, the out-of-tree ipw3945 driver and conceivably one of the git trees in -mm (although I suspect not the latter).

I don't know, but I know who to cc ;) Probably they will want to know which version of wireless-tools userspace you are running.
isochronous receives?
Keith, et al.,

I am having problems with isochronous receives, and remembered just as I was getting ready to dig into the source that there was a message about this stuff. Lo and behold, your message to linux1394-user from September 7:

> I'm trying to receive isochronous streams (using libraw1394 1.2.0), and I've noticed that if data is transmitted on channel 63, then my app tends to work fine. If the stream is on a different channel, then I don't see any isochronous packets at all. I'm using 2.4.29, I've also tried 2.6.15 with similar results, can't seem to receive channels < 63.

Did you ultimately have any success getting this going? Funnily enough, when I tested isochronous stuff in July, I just did iso transmit since I figured receives *must* be working since everyone has camcorders and whatnot.

Currently my iso xmit stuff does appear to be working, but iso receives are not. I have a Firespy and no reason not to trust it, so I can see the junk I'm spewing out. I've tried transmitting on channels 4 and 63 (per your advice), but neither works for me.

I suppose it could be my stuff... nah.

--
Robert Crocombe
[EMAIL PROTECTED]
Re: 2.6.19-rc6-mm2
On Tue, 28 Nov 2006 19:24:45 -0500 Thomas Tuttle <[EMAIL PROTECTED]> wrote: > I've found a couple of bugs so far... > > 1. I did `modprobe kvm' and then tried running a version of the KVM Qemu > compiled for a different kernel. My mistake. But I got an oops: > > BUG: unable to handle kernel NULL pointer dereference at virtual address > 0008 > printing eip: > f91f9c3f > *pde = > Oops: [#1] > SMP > last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_max_freq > Modules linked in: kvm iTCO_wdt i8k rfcomm l2cap rtc sdhci mmc_block mmc_core > hci_usb bluetooth b44 mii ohci1394 ieee1394 uhci_hcd ehci_hcd usbcore psmouse > evdev i915 drm cpuid msr speedstep_centrino video thermal processor fan > container button battery ac > CPU:0 > EIP:0060:[]Not tainted VLI > EFLAGS: 00010202 (2.6.19-rc6-mm1 #1) > EIP is at kvm_vmx_return+0xef/0x4d0 [kvm] > eax: e5490068 ebx: ecx: edx: e5491ca4 > esi: edi: e5490060 ebp: e5a4fde0 esp: e5a4fd54 > ds: 007b es: 007b ss: 0068 > Process qemu (pid: 24193, ti=e5a4e000 task=c2286a90 task.ti=e5a4e000) > Stack: 0002 0001 f7fe1278 0002 b7f92000 e549 > >e5a4fdac 00d8 f783a580 e5a4fdac c043b98a bfb93f7c > f91fa020 >e5a4fde0 bfb93f7c bfb93f7c f91fa0cb 04f3 c03fb974 e549 > > Call Trace: > [] kvm_dev_ioctl+0x0/0x1040 [kvm] > [] kvm_dev_ioctl+0xab/0x1040 [kvm] > [] error_code+0x7c/0x84 > [] kmap_atomic+0xc9/0xe0 > [] permission+0x2b/0xd0 > [] sys_swapon+0x978/0xaf0 > [] kunmap_atomic+0x63/0x70 > [] kmap_atomic+0xc9/0xe0 > [] kunmap_atomic+0x63/0x70 > [] get_page_from_freelist+0x27d/0x340 > [] kmap_atomic+0xc9/0xe0 > [] kunmap_atomic+0x63/0x70 > [] get_page_from_freelist+0x27d/0x340 > [] find_get_page+0x20/0x60 > [] filemap_nopage+0x2dc/0x490 > [] do_sync_read+0xc7/0x110 > [] kmap_atomic+0xc9/0xe0 > [] kunmap_atomic+0x63/0x70 > [] __handle_mm_fault+0x246/0x9c0 > [] kvm_dev_ioctl+0x0/0x1040 [kvm] > [] scsi_host_alloc+0x202/0x2a0 > [] do_ioctl+0x2b/0x90 > [] vfs_ioctl+0x5c/0x2b0 > [] sys_ioctl+0x3d/0x70 > [] syscall_call+0x7/0xb > [] 
scsi_host_alloc+0x202/0x2a0 > === > Code: 14 0f 87 77 02 00 00 8b 0c b5 00 15 20 f9 85 c9 0f 84 68 02 00 00 89 ea > 89 f8 ff d1 85 c0 0f 84 4c 02 00 00 89 f8 e8 31 e9 ff ff <65> a1 08 00 00 00 > 8b 40 04 8b 40 08 a8 04 0f 85 ae 02 00 00 e8 > EIP: [] kvm_vmx_return+0xef/0x4d0 [kvm] SS:ESP 0068:e5a4fd54 > msrs: 2 > > Oh, and I get a ton of these messages with kvm: > > rtc: lost some interrupts at 1024Hz.

KVM culprits cc'ed. The KVM patches as I got them didn't even compile on i386, so runtime breakage isn't very surprising. Looks like you need an x86_64 machine ;)
Re: [PATCH 1/2] lib + ntfs: let modules force HWEIGHT
Andrew Morton wrote:
> On Tue, 28 Nov 2006 14:08:40 -0800 Randy Dunlap <[EMAIL PROTECTED]> wrote:
> > From: Randy Dunlap <[EMAIL PROTECTED]>
> >
> > NTFS (=m) uses hweight32(), but that function is only linked
> > into the kernel image if it is used inside the kernel image,
> > not in loadable modules. Let modules force HWEIGHT to be
> > built into the kernel image. Otherwise build fails:
> >
> > Building modules, stage 2.
> > MODPOST 94 modules
> > WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
> >
> > Yes, I'd certainly prefer for this to be more automated rather than
> > forced by each module that needs it.
>
> Perhaps we should just put it in lib-y and remove CONFIG_GENERIC_HWEIGHT.
> It's either part of the API or it ain't.

Yes, that matches how I feel about it, but I expected some disagreement (from elsewhere, not from you).

I'll send another patch later. Replacement patch OK? (vs. update)

--
~Randy
Re: [PATCH 1/2] lib + ntfs: let modules force HWEIGHT
On Tue, 28 Nov 2006 14:08:40 -0800 Randy Dunlap <[EMAIL PROTECTED]> wrote:
> From: Randy Dunlap <[EMAIL PROTECTED]>
>
> NTFS (=m) uses hweight32(), but that function is only linked
> into the kernel image if it is used inside the kernel image,
> not in loadable modules. Let modules force HWEIGHT to be
> built into the kernel image. Otherwise build fails:
>
> Building modules, stage 2.
> MODPOST 94 modules
> WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
>
> Yes, I'd certainly prefer for this to be more automated rather than
> forced by each module that needs it.

Perhaps we should just put it in lib-y and remove CONFIG_GENERIC_HWEIGHT. It's either part of the API or it ain't.
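For context on what the missing symbol computes: hweight32() returns the Hamming weight of a 32-bit word, that is, the number of bits set. The sketch below is a generic userspace population count written for illustration only; the name hweight32_sketch() is hypothetical, and nothing here is claimed to match the kernel's own lib/ implementation line for line.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative population count: sums adjacent bit groups in parallel
 * (pairs, then nibbles, then bytes) and finally adds the four byte
 * counts together with a multiply.
 */
unsigned int hweight32_sketch(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);                 /* 2-bit sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (w * 0x01010101u) >> 24;                   /* add the bytes */
}
```

A quick check such as assert(hweight32_sketch(0xf0) == 4) confirms the behaviour. A module that calls the real hweight32() still depends on the kernel image providing that symbol, which is the linking problem this thread is about.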
Re: [PATCH 1/6] ext2 balloc: fix _with_rsv freeze
On Tue, 2006-11-28 at 20:07 +, Hugh Dickins wrote: > On Tue, 28 Nov 2006, Mingming Cao wrote: > > On Tue, 2006-11-28 at 17:40 +, Hugh Dickins wrote: > > > After several days of testing ext2 with reservations, it got caught inside > > > ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding > > > on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to > > > find the free block guaranteed to be included (unless there's contention). > > > > > > > Hmm, I suspect there is other issue: alloc_new_reservation should not > > repeatedly allocating the same window, if ext2_try_to_allocate > > repeatedly fails to find a free block in that window. > > find_next_reservable_window() takes my_rsv (the old window that he > > thinks there is no free block) as a guide to find a window "after" the > > end block of my_rsv, so how could this happen? > > Hmmm. I haven't studied that part of the code, but what you say sounds > sensible: that would leave more to be explained, yes. I guess it would > happen if all the rest of the bitmap were either allocated or reserved, But bitmap_search_next_usable_block() will fail in the case the rest of bitmap were allocated, and find_next_reservable_space() will fail in the case the rest of group were all reserved. alloc_new_reservation() should not create a new window in this case. > but I don't believe that was the case here: I have noted that the map > was all 00s from offset 0x1ae onwards, plenty unallocated; I've not > recorded the following reservations, but it seems unlikely they covered > the remaining free area (and still covered it even when the remaining > tasks got to the point of just waiting for this one). 
> > > > > > Fix the range to find_next_usable_block's memscan: the scan from "here" > > > (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes > > > not 2 (the relevant bytes of bitmap in this case being f7 df ff - none > > > 00, but the premature cutoff implying that the last was found 00). > > > > > > > alloc_new_reservation() reserved a window with free block, when come to > > the time to claim it, it scans the window again. So it seems that the > > range of the the scan is too small: > > The range of the scan is 1 byte too small in this case, yes. > > > > > p = ((char *)bh->b_data) + (here >> 3); > > r = memscan(p, 0, (maxblocks - here + 7) >> 3); > > next = (r - ((char *)bh->b_data)) << 3; > > > > -> next is -1 > > I don't understand you: next was not -1, it was 0xd08. > > > if (next < maxblocks && next >= here) > > return next; > > > > --> falls to false branch > > No, it passed the "next < maxblocks && next >= here" test > (maxblocks being 0xd0e and here being 0xcfe), so returned > pointing to an allocated block - then the caller finds it > cannot set the bit. > Apologies for the confusion. I thought ext2_try_to_allocate() failed because we could not find a free block in the reserved window (i.e., find_next_usable_block() failed) It seems in this case, find_next_usable_block() incorrectly returns a bit it *thinks* free, but ext2_try_to_allocate() fails to claim it as it's being marked as used. So yes, Acked this fix. Thanks. > > > > here = bitmap_search_next_usable_block(here, bh, maxblocks); > > return here; > > > > So we failed to find a free byte in the range. That's seems fine to me. > > It's only a nice thing to have -- try to allocate a block in a place > > where it's neighbors are all free also. If it fails, it will search the > > window bit by bit. So I don't understand why it is not being recovered > > by bitmap_search_next_usable_block(), which test the bitmap bit by bit? > > It already returned, it doesn't reach that line. 
> Yep. > > > > > Is this a problem for mainline ext2? No, because the "size" in its > > > memscan > > > is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a > > > multiple of 8. Is this a problem for ext3 or ext4? No, because they have > > > an additional extN_test_allocatable test which rescues them from the > > > error. > > > > > Hmm, if the error is it prematurely think there is no free block in the > > range (bitmap on disk), then even in ext3/4, it will not bother checking > > the jbd copy of the bitmap. I am not sure this is the cause that ext3/4 > > may not has the problem. > > In the ext3/4 case, it indeed won't bother to check the jbd copy > (having found this bitmap bit set), it'll fall through to the > bitmap_search_next_usable_block you indicated above, > and that should do the right thing, finding the first > free bit in the area originally reserved. > Make sense. > > > > > But the bigger question is, why does the my_rsv case come here to > > > find_next_usable_block at all? > > > > Because grp_goal is -1? > > Well, yes, but my point is that we've got a reservation, and we're > hoping to
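The byte-count arithmetic discussed in this thread can be checked in isolation. The sketch below compares the length expression from the quoted snippet with one that covers every byte touched by the bit range [here, maxblocks). The function names are hypothetical, and the second expression is only one way to write a covering count; it is not claimed to be the exact text of the patch.

```c
#include <assert.h>

/* Length as in the quoted snippet: the bit span rounded up to bytes. */
unsigned int scan_bytes_quoted(unsigned int here, unsigned int maxblocks)
{
	assert(here < maxblocks);
	return (maxblocks - here + 7) >> 3;
}

/*
 * Length that covers every byte holding a bit in [here, maxblocks):
 * index one past the last byte, minus index of the first byte.
 */
unsigned int scan_bytes_covering(unsigned int here, unsigned int maxblocks)
{
	assert(here < maxblocks);
	return ((maxblocks + 7) >> 3) - (here >> 3);
}
```

With the values from the report (here = 0xcfe, maxblocks = 0xd0e), the quoted expression gives 2 bytes while the range spans bytes 0x19f through 0x1a1, which is 3. A scan that stops after 2 non-zero bytes ends at byte 0x1a1, and 0x1a1 << 3 is 0xd08, the value Hugh reports for "next". When "here" is byte-aligned the two expressions agree, which fits the explanation of why mainline ext2 is unaffected.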
Re: 2.6.19-rc6-mm2
I've found a couple of bugs so far... 1. I did `modprobe kvm' and then tried running a version of the KVM Qemu compiled for a different kernel. My mistake. But I got an oops: BUG: unable to handle kernel NULL pointer dereference at virtual address 0008 printing eip: f91f9c3f *pde = Oops: [#1] SMP last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_max_freq Modules linked in: kvm iTCO_wdt i8k rfcomm l2cap rtc sdhci mmc_block mmc_core hci_usb bluetooth b44 mii ohci1394 ieee1394 uhci_hcd ehci_hcd usbcore psmouse evdev i915 drm cpuid msr speedstep_centrino video thermal processor fan container button battery ac CPU:0 EIP:0060:[]Not tainted VLI EFLAGS: 00010202 (2.6.19-rc6-mm1 #1) EIP is at kvm_vmx_return+0xef/0x4d0 [kvm] eax: e5490068 ebx: ecx: edx: e5491ca4 esi: edi: e5490060 ebp: e5a4fde0 esp: e5a4fd54 ds: 007b es: 007b ss: 0068 Process qemu (pid: 24193, ti=e5a4e000 task=c2286a90 task.ti=e5a4e000) Stack: 0002 0001 f7fe1278 0002 b7f92000 e549 e5a4fdac 00d8 f783a580 e5a4fdac c043b98a bfb93f7c f91fa020 e5a4fde0 bfb93f7c bfb93f7c f91fa0cb 04f3 c03fb974 e549 Call Trace: [] kvm_dev_ioctl+0x0/0x1040 [kvm] [] kvm_dev_ioctl+0xab/0x1040 [kvm] [] error_code+0x7c/0x84 [] kmap_atomic+0xc9/0xe0 [] permission+0x2b/0xd0 [] sys_swapon+0x978/0xaf0 [] kunmap_atomic+0x63/0x70 [] kmap_atomic+0xc9/0xe0 [] kunmap_atomic+0x63/0x70 [] get_page_from_freelist+0x27d/0x340 [] kmap_atomic+0xc9/0xe0 [] kunmap_atomic+0x63/0x70 [] get_page_from_freelist+0x27d/0x340 [] find_get_page+0x20/0x60 [] filemap_nopage+0x2dc/0x490 [] do_sync_read+0xc7/0x110 [] kmap_atomic+0xc9/0xe0 [] kunmap_atomic+0x63/0x70 [] __handle_mm_fault+0x246/0x9c0 [] kvm_dev_ioctl+0x0/0x1040 [kvm] [] scsi_host_alloc+0x202/0x2a0 [] do_ioctl+0x2b/0x90 [] vfs_ioctl+0x5c/0x2b0 [] sys_ioctl+0x3d/0x70 [] syscall_call+0x7/0xb [] scsi_host_alloc+0x202/0x2a0 === Code: 14 0f 87 77 02 00 00 8b 0c b5 00 15 20 f9 85 c9 0f 84 68 02 00 00 89 ea 89 f8 ff d1 85 c0 0f 84 4c 02 00 00 89 f8 e8 31 e9 ff ff <65> a1 08 00 00 00 8b 40 04 8b 40 08 a8 
04 0f 85 ae 02 00 00 e8 EIP: [] kvm_vmx_return+0xef/0x4d0 [kvm] SS:ESP 0068:e5a4fd54 msrs: 2 Oh, and I get a ton of these messages with kvm: rtc: lost some interrupts at 1024Hz. 2. I'm not sure if this bug is in the kernel, wireless tools, or the ipw3945 driver, but I haven't changed the version of anything but the kernel. When I do `iwconfig eth1 essid foobar' something drops the last character of the essid, and a subsequent `iwconfig eth1' shows "fooba" as the essid. And it's actually set as "fooba", since I had to do `iwconfig eth1 essid MyUsualEssid_' (note underscore) to get on to my usual network. --Thomas Tuttle
Re: [PATCH] prune_icache_sb
On Tue, 28 Nov 2006 16:41:07 -0500 Wendy Cheng <[EMAIL PROTECTED]> wrote:
> Andrew Morton wrote:
> > On Mon, 27 Nov 2006 18:52:58 -0500
> > Wendy Cheng <[EMAIL PROTECTED]> wrote:
> >
> >
> >> Not sure about walking thru sb->s_inodes for several reasons
> >>
> >> 1. First, the changes made are mostly for file server setup with large
> >> fs size - the entry count in sb->s_inodes may not be shorter then
> >> inode_unused list.
> >>
> >
> > umm, that's the best-case. We also care about worst-case. Think:
> > 1,000,000 inodes on inode_unused, of which a randomly-sprinkled 10,000 are
> > from the being-unmounted filesytem. The code as-proposed will do 100x more
> > work that it needs to do. All under a global spinlock.
> >
> By walking thru sb->s_inodes, we also need to take inode_lock and
> iprune_mutex (?), since we're purging the inodes from the system - or
> specifically, removing them from inode_unused list. There is really not
> much difference from the current prune_icache() logic.

There's quite a bit of difference.

The change you're proposing will perform poorly if it is used in the scenario which I describe above. It will waste CPU cycles and will destroy the inode_unused LRU ordering (for what that's worth, which isn't much).

Trust me, every single time we've had an inefficient search in core kernel, someone has gone and done something which hits it and causes general meltdown in their workload. So we've had to make significant changes to remove the O(n) or higher search complexity.

And in this case we *already have* the data structures in place to make it O(1).

> What's been
> proposed here is simply *exporting* the prune_icache() kernel code to
> allow filesystems to trim (purge a small percentage of ) its
> (potentially will be) unused per-mount inodes for *latency* considerations.

It just happens to work in your setup.
If you have a large machine with two filesystems and you run rsync on both filesystems and run FTP against one of them, it might not work so well, because the proposed prune_icache_sb() might need to chew through 500,000 inodes from the wrong superblock before reclaiming any of the inodes which you want to reclaim. Or something like that.

> I made a mistake by using the "page dirty ratio" to explain the problem
> (sorry! I was not thinking well in previous write-up) that could mislead
> you to think this is a VM issue. This is not so much about
> low-on-free-pages (and/or memory fragmentation) issue (though
> fragmentation is normally part of the symptoms). What the (external)
> kernel module does is to tie its cluster-wide file lock with in-memory
> inode that is obtained during file look-up time. The lock is removed
> from the machine when
>
> 1. the lock is granted to other (cluster) machine; or
> 2. the in-memory inode is purged from the system.

It seems peculiar to be tying the lifetime of a DLM lock to the system's memory size and current memory pressure?

> One of the clusters that has this latency issue is an IP/TV application
> where it "rsync" with main station server (with long geographical
> distance) every 15 minutes. It subsequently (and constantly) generates
> large amount of inode (and locks) hanging around. When other nodes,
> served as FTP servers, within the same cluster are serving the files,
> DLM has to wade through huge amount of locks entries to know whether the
> lock requests can be granted. That's where this latency issue gets
> popped out. Our profiling data shows when the cluster performance is
> dropped into un-acceptable ranges, DLM could hogs 40% of CPU cycle in
> lock searching logic. From VM point of view, the system does not have
> memory shortage so it doesn't have a need to kick off prune_icache() call.

OK..
> This issue could also be fixed in several different ways - maybe by a > better DLM hash function, It does sound like the lock lookup is broken. I assume there's some reason for keeping these things floating about in memory, so there must be a downside to artificially pruning them in this manner? If so, a (much) faster lookup would seem to be the best fix. > maybe by asking IT people to umount the > filesystem where *all* per-mount inodes are unconditionally purged (but > it defeats the purpose of caching inodes and, in our case, the locks) > after each rsync, , etc. But I do think the proposed patch is the > most sensible way to fix this issue and believe it will be one of these > functions that if you export it, people will find a good use of it. It > helps with memory fragmentation and/or shortage *before* it becomes a > problem as well. I certainly understand and respect a maintainer's > daunting job on how to take/reject a patch - let me know how you think > so I can start to work on other solutions if required. We shouldn't export this particular implementation to modules because it has bad failure modes. There might be a case for exposing an
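The cost argument made earlier in this message (filtering one global list versus walking a per-superblock list) can be illustrated with a toy model. This is a sketch under stated assumptions: the global unused list is modelled as a plain array of superblock ids, and entries_examined() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: sb_of[i] is the superblock id of the i-th entry on a
 * global unused list. Returns how many entries a front-to-back walk
 * examines before it has found `want` entries belonging to `sb`
 * (or runs off the end of the list).
 */
size_t entries_examined(const int *sb_of, size_t n, int sb, size_t want)
{
	size_t i, found = 0;

	assert(sb_of != NULL || n == 0);

	for (i = 0; i < n && found < want; i++) {
		if (sb_of[i] == sb)
			found++;
	}
	return i;
}
```

A walk of a per-superblock list examines exactly "want" entries, while the filtered global walk also touches every foreign entry in between. With 10,000 wanted inodes sprinkled evenly among 1,000,000, that is roughly the 100x factor quoted above.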
Re: [PATCH] Don't compare unsigned variable for <0 in sys_prctl()
On Wed, 29 Nov 2006, Jesper Juhl wrote:
>
> I would venture that "-Wshadow" is another one of those.

I'd agree, except for the fact that gcc does a horribly _bad_ job of -Wshadow, making it (again) totally unusable.

For example, it's often entirely interesting to hear about local variables that shadow each other. No question about it.

HOWEVER. It's _not_ really interesting to hear about a local variable that happens to have a common name that is also shared by an extern function.

There just isn't any room for confusion, and it's actually not even that unusual - I tried using -Wshadow on real programs, and it was just horribly irritating.

In the kernel, we had obvious things like local use of "jiffies" that just make _total_ sense in a small inline function, and the fact that there happens to be an extern declaration for "jiffies" just isn't very interesting.

Similarly, with nested macro expansion, even the "local variable shadows another local variable" case - that looks like it should have an obvious warning on the face of it - really isn't always necessarily that interesting after all. Maybe it is a bug, maybe it isn't, but it's no longer _obviously_ bogus any more.

So I'm not convinced about the usefulness of "-Wshadow". ESPECIALLY the way that gcc implements it, it's almost totally useless in real life.

For example, I tried it on "git" one time, and this is a perfect example of why "-Wshadow" is totally broken:

	diff-delta.c: In function 'create_delta_index':
	diff-delta.c:142: warning: declaration of 'index' shadows a global declaration

(and there's a _lot_ of those). If I'm not allowed to use "index" as a local variable and include at the same time, something is simply SERIOUSLY WRONG with the warning.

So the fact is, the C language has scoping rules for a reason. Can you screw yourself by using them badly? Sure. But that does NOT mean that the same name in different scopes is a bad thing that should be warned about.

If I wanted a language that didn't allow me to do anything wrong, I'd be using Pascal. As it is, it turns out that things that "look" wrong on a local level are often not wrong after all.

		Linus
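A minimal sketch of the kind of code the diff-delta.c warning was about, for readers who have not run into it. It assumes a C library whose string headers also declare a legacy function named index(), which many do. The function find_char() is made up for this example. Whether a given compiler warns here under -Wshadow depends on its version, so no particular output is claimed.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns the position of the first occurrence of `c` in `s`, or -1.
 * The local variable is deliberately called `index`: on libraries that
 * declare a global index() function, this is the shadowing that the
 * warning quoted above complains about, although the code is correct.
 */
int find_char(const char *s, char c)
{
	size_t index;	/* perfectly legal local name in C's scoping rules */
	size_t len;

	assert(s != NULL);
	len = strlen(s);

	for (index = 0; index < len; index++) {
		if (s[index] == c)
			return (int)index;
	}
	return -1;
}
```

Inside the function the name can only refer to the local variable, which is the "no room for confusion" point made in the message.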
Re: [PATCH] [NET] dont insert socket dentries into dentry_hashtable.
On Tue, 28 Nov 2006 15:35:31 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:
>
> Andrew, I'm fine with these three patches, specifically:
>
> [PATCH] dont insert pipe dentries into dentry_hashtable.
> [PATCH] [DCACHE] : avoid RCU for never hashed dentries
> [PATCH] [NET] dont insert socket dentries into dentry_hashtable.
>
> Could you toss them into -mm if you haven't already?

They were in rc6-mm2.

> This
> makes better sense then me putting it into net-2.6.20 since
> it touches FS stuff.
>

No probs, they're all lined up and ready to go, thanks.
Re: [-mm patch] drivers/mtd/nand/rtc_from4.c: use lib/bitrev.c
On Tue, 28 Nov 2006 22:52:16 + David Woodhouse <[EMAIL PROTECTED]> wrote:
> > I'll take that as an ack and shall merge this once
> > crc32-replace-bitreverse-by-bitrev32.patch is merged ;)
>
> I assume the bitrev thing will be going in as soon as 2.6.19 is actually
> released,

It will take over a week after 2.6.19 - I prefer to wait until the git tree laggards^Wowners have merged before merging -mm stuff, so things land in appropriate order.

> so there's no point in me reverting it from the mtd tree?

Your call.

I do have a fixlet against this patch:

--- a/drivers/mtd/nand/rtc_from4.c~drivers-mtd-nand-rtc_from4c-use-lib-bitrevc-tidy
+++ a/drivers/mtd/nand/rtc_from4.c
@@ -357,7 +357,7 @@ static int rtc_from4_correct_data(struct
 	/* Read the syndrom pattern from the FPGA and correct the bitorder */
 	rs_ecc = (volatile unsigned short *)(rtc_from4_fio_base + RTC_FROM4_RS_ECC);
 	for (i = 0; i < 8; i++) {
-		ecc[i] = byte_rev_table[(*rs_ecc) & 0xFF];
+		ecc[i] = bitrev8(*rs_ecc);
 		rs_ecc++;
 	}
_
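To make the fixlet easier to follow: bitrev8() reverses the bit order within one byte, so the explicit "& 0xFF" mask becomes unnecessary once the argument is narrowed to 8 bits. The sketch below is a userspace illustration under that assumption. bitrev8_sketch() is a hypothetical name, and the swap-based method shown is only one way to do it, not necessarily how lib/bitrev.c does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reverse the bits of one byte by swapping progressively smaller
 * groups: the two nibbles, then bit pairs, then neighbouring bits.
 */
uint8_t bitrev8_sketch(uint8_t b)
{
	b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
	b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
	b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
	return b;
}
```

Passing a 16-bit value such as *rs_ecc to a function that takes a uint8_t keeps only the low 8 bits, which matches what the old table lookup did with its mask. For instance, assert(bitrev8_sketch(0x12) == 0x48) holds.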
Re: [PATCH 1/2 -mm] fault-injection: safer defaults, trivial optimization, cleanup
On Tue, 28 Nov 2006 14:50:45 -0800 Don Mullis <[EMAIL PROTECTED]> wrote:
> On Tue, 2006-11-28 at 13:37 -0800, Andrew Morton wrote:
>
> > We'd prefer one-patch-per-concept, please. This all sounds like about
> > six patches.
>
> Understood.
>
> > We _could_ merge this patch as-is, but it means that when this stuff
> > finally hits mainline it would go in as a nice sequence of logical patches,
> > followed by a random thing which is splattered all over all the preceding
> > patches.
>
> Does this argue for a respin of the original patches, folding in
> content from this one, rather than splitting it into an additional six to
> be appended to the series?

If the fixes are one-patch-per-concept, and if the original patch series is one-patch-per-concept (it is) then I can usually insert the fixups in the right place, later fold each into its appropriate base patch and everything lands in git squeaky-clean.
Re: 2.6.19-rc6-rt8
On Tuesday, 28 November 2006 23:40, Karsten Wiese wrote:
> On Monday, 27 November 2006 10:49, Ingo Molnar wrote:
> > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from
>
> I saw usb transport errors here before rebooting with
> nmi_watchdog=0
> contained in kernel command line.
>
> Testcase stalled within 2 minutes before change,
> ticks happily after change for 15 minutes now.
> .config is a "release" type, no debugging options.

After an estimated 15 minutes more it bugged again. The related dmesg output translates to the Linux error -EXDEV, probably caused by the following lines:

static int uhci_result_isochronous(struct uhci_hcd *uhci, struct urb *urb)
{
	struct uhci_td *td, *tmp;
	struct urb_priv *urbp = urb->hcpriv;
	struct uhci_qh *qh = urbp->qh;

	list_for_each_entry_safe(td, tmp, &urbp->td_list, list) {
		unsigned int ctrlstat;
		int status;
		int actlength;

		if (uhci_frame_before_eq(uhci->cur_iso_frame, qh->iso_frame))
			return -EINPROGRESS;

		uhci_remove_tds_from_frame(uhci, qh->iso_frame);

		ctrlstat = td_status(td);
		if (ctrlstat & TD_CTRL_ACTIVE) {
			status = -EXDEV;	/* TD was added too late? */

	Karsten
Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw
On Tue, 2006-11-28 at 23:22 +, David Woodhouse wrote:
> On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:
> > From: Martin Devera <[EMAIL PROTECTED]>
> >
> > Add hotswap capability to Serverworks/BroadCom SATA controlers. The
> > controler has SIM register and it selects which bits in SATA_ERROR
> > register fires interrupt.
> > The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode
> > error (unplug) and calls into Lukasz's hotswap framework.
> > The code got one day testing on dual core Athlon64 H8SSL Supermicro
> > MoBo with HT-1000 SATA, SMP kernel and two CaviarRE SATA HDDs in
> > hotswap bays.
> >
> > Signed-off-by: Martin Devera <[EMAIL PROTECTED]>
>
> What became of this?

I might be to blame for not testing it... The Xserve I had on my desk was too noisy for most of my co-workers, so I kept delaying and forgot about it.

Also, the Xserve I have only has one disk, which makes hotplug testing a bit harder :-)

Ben.
Re: [rfc PATCH] ieee1394: ohci1394: delete bogus spinlock, flush MMIO writes
Alan wrote:
> On Tue, 28 Nov 2006 22:24:11 +0100 (CET)
> Stefan Richter <[EMAIL PROTECTED]> wrote:
>> All MMIO writes which were surrounded by the spinlock as well as the
>> very last MMIO write of the IRQ handler are now explicitly flushed by
>> MMIO reads of the respective register.
>
> MMIO is ordered anyway on the bus, you just need mmiowb() to force
> ordering to the bus controller in case you are on a big numa box.

The mmiowb is a checkpoint to ensure ordering between different threads of MMIO writes; i.e. it doesn't halt the thread until the write actually reached the device like a read would do, right?
--
Stefan Richter
-=-=-==- =-== ===-=
http://arcgraph.de/sr/
Re: failed 'ljmp' in linear addressing mode
On Tue, 28 Nov 2006, Jun Sun wrote: > On Tue, Nov 28, 2006 at 08:46:44AM -0500, linux-os (Dick Johnson) wrote: >> >> On Mon, 27 Nov 2006, Jun Sun wrote: >> >>> >>> On Mon, Nov 27, 2006 at 08:58:57AM -0500, linux-os (Dick Johnson) wrote: I think it probably resets the instant that you turn off paging. To turn off paging, you need to copy some code (properly linked) to an area where there is a 1:1 mapping between virtual and physical addresses. A safe place is somewhere below 1 megabyte. Then you need to set up a call descriptor so you can call that code (you can ljump if you never plan to get back). You then need to clear interrupts on all CPUs (use a spin-lock). Once you are executing from the new area, you reset your segments to the new area. The call descriptor would have already set CS, as would have the long-jump. At this time you can turn off paging and flush the TLB. You are now in linear-address protected mode. >>> >>> Thanks for the reply. But I am pretty much sure I did above correctly. >>> I use single-instruction infinite loop in the call path to verify >>> that control does reach last 'ljmp' but not the jump destination. >>> >>> Below is the hack I made to machine_kexec.c file. As you can see, I >>> managed to make the identical mapping between virtual and physical >>> addresses. >>> >>> Note I did not copy the code into the first 1M. In fact the code >>> is located at 0xc0477000 (0x00477000 in physical). I thought that should be >>> OK as I did not really go all the way back to real-address mode. >>> >>> That last suspect I have now is the wrong value in CS descriptor. Does >>> kernel >>> have a suitable CS descriptor for the last ljmep to 0x1000 in linear >>> addressing mode? The CS descriptor seems to be a pretty dark magic to me >>> ... >>> >>> Cheers. 
>>> >>> Jun >>> >>> - >>> diff -Nru linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig >>> linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c >>> --- linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig 2006-10-13 >>> 11:55:04.0 -0700 >>> +++ linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c2006-11-22 >>> 15:01:45.0 -0800 >>> @@ -212,3 +212,19 @@ >>>rnk = (relocate_new_kernel_t) reboot_code_buffer; >>>(*rnk)(page_list, reboot_code_buffer, image->start, cpu_has_pae); >>> } >>> + >>> +extern void do_os_switching(void); >>> +void os_switch(void) >>> +{ >>> + void (*foo)(void); >>> + >>> + /* absolutely no irq */ >>> + local_irq_disable(); >>> + >>> + /* create identity mapping */ >>> + foo=virt_to_phys(do_os_switching); >>> + identity_map_page((unsigned long)foo); >>> + >>> + /* jump to the real address */ >>> + foo(); >>> +} >>> >> Get a copy of the Intel 486 Microprocessor Reference Manual or read it on- >> line. There is no way that you can make a call like that. > > By "a call like that", you mean "foo()"? Are you sure about that? > > The machine_kexec() function in the same file is basically doing the > same way (i.e., use "call *$eax" instead of "ljmp"). That is where I got > my idea from. > > In addition, if I put "1: jmp 1b" instruction anywhere *inside* > do_os_switching() I would get infinite hanging instead of reboot, > which seems to suggest I *did* jump into do_os_switching() successfully. > > According to Intel Architecture Software Developer's Manual (1997), Vol 3, > page 8-14: > > "2. If paging is enabled perform the following operations: > > - Transfer program control to linear addresses that are identity mapped to >physical addresses (that is, linear addresses equal physical addresses) > ... > " > > it does not indicate one has to use "ljmp" to do this control transfer. Assume you are accessing memory at 0xc000-. This address, when page translation is occurring (page 5-17), consists of three parts. 
(1) A 12-bit offset, bits 0:11
(2) A 10-bit index, bits 12:21
(3) A 10-bit index, bits 22:31

So 0xc00 is an index into the page directory. If you wish to turn off
translation, you can't just turn off those bits. The next instruction
will be fetched from memory with the page-cache upper bits reset, i.e.,
using offset 0 of the page directory. You somehow need to turn off
those bits at the same time the next instruction is fetched. Normally
you use a call gate. However, you can do a long jump which reloads
the segment register.

When the instruction book says "transfer control" it doesn't mean
just jump to some offset. When the instruction address is 0xC000-,
it is not the same as 0x-. These two addresses are different
(to the CPU) until after those page translation bits are reset, not
before.

>
>> You would need to
>> call through a task-gate or otherwise set the code-segment and the
>> instruction pointer at the same instant. First, look at the startup code
>> for a GDT entry that maps the linear address-space you are
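
The three-part split of a linear address described above can be sketched in C. This is only an illustration, assuming classic non-PAE i386 paging with 4 KiB pages; the struct and function names are hypothetical and do not come from the kernel source.

```c
#include <stdint.h>

/*
 * Hypothetical helper type: the three fields of a 32-bit linear address
 * under non-PAE i386 paging with 4 KiB pages (an assumption of this sketch).
 */
struct linear_addr_parts {
    uint32_t dir_index;   /* bits 22..31: entry in the page directory */
    uint32_t table_index; /* bits 12..21: entry in the page table */
    uint32_t offset;      /* bits 0..11: byte offset within the page */
};

/* Split a linear address into its directory index, table index and offset. */
static struct linear_addr_parts split_linear_addr(uint32_t addr)
{
    struct linear_addr_parts p;

    p.dir_index = (addr >> 22) & 0x3ffu;
    p.table_index = (addr >> 12) & 0x3ffu;
    p.offset = addr & 0xfffu;
    return p;
}
```

With the addresses mentioned earlier in the thread, split_linear_addr(0xc0477000) gives directory index 0x301, while split_linear_addr(0x00477000) gives directory index 0x1. The two addresses go through different page-directory entries, so they only reach the same memory if both entries are set up to map it.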
2.6.18 tsc clocksource + ntp = excessive drift; acpi_pm does fine.
Hi,

with the default boot, the tsc clocksource gets selected on Debian's
2.6.18-3-k7 SMP build (but on a UP machine). ntp keeps bothering me with
this message:

frequency error 512 PPM exceeds tolerance 500 PPM

If I remove ntp's drift file and restart, it goes fine for a while and then
the same behaviour comes back. If I remove ntp's drift file, then do a:

echo acpi_pm >/sys/devices/system/clocksource/clocksource0/current_clocksource

and then restart ntp, it goes fine "forever".

Any thoughts, something I should look at? I'll be glad to give more
feedback. I don't know if that happened with 2.6.17, but I'm pretty sure
that with 2.6.16 it was fine.

- Alexandre
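
To put the numbers in the ntp message in perspective, a frequency error in PPM can be converted to seconds of drift per day. The following is a minimal sketch of that arithmetic only; the function name is made up for illustration and is not part of ntp or the kernel.

```c
/*
 * Seconds of clock error accumulated over one day (86400 s) for a
 * constant frequency error given in parts per million.
 * Hypothetical helper, for illustration only.
 */
static double drift_seconds_per_day(double ppm)
{
    return ppm * 1e-6 * 86400.0;
}
```

A 512 PPM error works out to roughly 44 seconds per day, and the 500 PPM tolerance quoted in the message to about 43 seconds per day.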
Re: [PATCH] Don't compare unsigned variable for <0 in sys_prctl()
On 29/11/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> On Tue, 28 Nov 2006, Jesper Juhl wrote:
> >
> > > Friends don't let friends use "-W".
> >
> > Hehe, ok, I'll stop cleaning this stuff up then.
> > Nice little hobby out the window there ;)
>
> You might want to look at some of the other warnings gcc spits out, but
> this class isn't one of them. Other warnings we have added over the years
> (and that really _are_ good warnings) have included the
> "-Wstrict-prototypes", and some other ones.
>
> If you can pinpoint _which_ gcc warning flag it is that causes gcc to emit
> the bogus ones, you _could_ try "-W -Wno-xyz-warning", which should cause
> gcc to enable all the "other" warnings, but then not the "xyz-warning"
> that causes problems.
>
> Of course, there is often a reason why a warning is in "-W" but not in
> "-Wall". Most of the time it's a sign that the warning is bogus. Not
> always, though - we do tend to want to be fairly strict, and
> Wstrict-prototypes is an example of a _good_ warning that is not in -Wall.

I would venture that "-Wshadow" is another one of those. I've, in the
past, submitted quite a few patches to clean up shadow warnings (some
accepted, some not) and I'll probably try going down that path again
in the near future. It's a class of warnings that has the potential
to uncover real bugs (even if we don't currently have any) and it
would be a nice one to be able to enable by default in the Makefile.

I agree with you though that the "expression always false|true due to
unsigned" type of warnings are usually bogus - although there have
actually been real bugs hiding behind some of those warnings in the
past. But, I'll make sure to only submit patches for that type of
warning in the future if I can prove that the warning actually
uncovered a real bug.
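
For readers unfamiliar with this class of warning, here is a small sketch of the kind of comparison being discussed. It is not the sys_prctl() code; both function names are hypothetical and exist only to illustrate the point.

```c
/*
 * The kind of test gcc flags under "-W": arg is unsigned, so the
 * comparison with 0 can never be true and this always returns 0.
 * Hypothetical function, not taken from the kernel.
 */
static int is_negative_unsigned(unsigned long arg)
{
    return arg < 0;
}

/*
 * A range check that does not need the "< 0" test: a negative value
 * passed by a caller is converted to a very large unsigned value and
 * is rejected by the upper-bound comparison alone.
 */
static int arg_in_range(unsigned long arg, unsigned long max)
{
    return arg <= max;
}
```

In the harmless case the "< 0" test is simply dead code. A real bug only hides behind the warning when the author relied on that test to reject something and no other check, such as the upper bound here, does the job.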
-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html