[patch 03/23] Add get_unaligned to ieee80211_get_radiotap_len
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Andy Green <[EMAIL PROTECTED]>

patch dfe6e81deaa79c85086c0cc8d85b229e444ab97f in mainline.

ieee80211_get_radiotap_len() dereferences the radiotap length field
without taking into account that it can be completely unaligned, so
get_unaligned() is required.

Signed-off-by: Andy Green <[EMAIL PROTECTED]>
Signed-off-by: John W. Linville <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/mac80211/ieee80211.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/mac80211/ieee80211.c
+++ b/net/mac80211/ieee80211.c
@@ -350,7 +350,7 @@ static int ieee80211_get_radiotap_len(st
 	struct ieee80211_radiotap_header *hdr =
 		(struct ieee80211_radiotap_header *) skb->data;
 
-	return le16_to_cpu(hdr->it_len);
+	return le16_to_cpu(get_unaligned(&hdr->it_len));
 }
 
 #ifdef CONFIG_MAC80211_LOWTX_FRAME_DUMP

-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
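For readers unfamiliar with the problem, here is a minimal userspace sketch of what an unaligned little-endian 16-bit read involves. It is not the kernel's get_unaligned() implementation; read_le16_unaligned() and the demo buffer are hypothetical names made up for illustration, and the byte-wise copy is just one portable way to avoid a misaligned load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): read a little-endian 16-bit
 * value from an address that may not be 2-byte aligned.
 */
static uint16_t read_le16_unaligned(const void *p)
{
	uint8_t b[2];

	memcpy(b, p, sizeof(b));	/* byte copy, no alignment requirement */
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Usage sketch: a header that starts at an odd offset in the buffer. */
static void read_le16_unaligned_demo(void)
{
	/* byte 0 is padding; version, pad, then the length 0x1234 (LE) */
	uint8_t buf[8] = { 0xaa, 0x00, 0x00, 0x34, 0x12, 0x00, 0x00, 0x00 };

	/* the length field sits at offset 3, which is not 2-byte aligned */
	assert(read_le16_unaligned(buf + 3) == 0x1234);
}
```

On architectures that trap or misbehave on misaligned loads, a plain 16-bit pointer dereference of buf + 3 would be the problematic case; the byte-wise version works regardless of where the buffer starts.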
[patch 01/23] mac80211: filter locally-originated multicast frames
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: John W. Linville <[EMAIL PROTECTED]>

patch b331615722779b078822988843ddffd4eaec9f83 in mainline.

In STA mode, the AP will echo our traffic.  This includes multicast
traffic.

Receiving these frames confuses some protocols and applications, notably
IPv6 Duplicate Address Detection.

Signed-off-by: John W. Linville <[EMAIL PROTECTED]>
Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>
Acked-by: Michael Wu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/mac80211/ieee80211.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/mac80211/ieee80211.c
+++ b/net/mac80211/ieee80211.c
@@ -2836,9 +2836,10 @@ ieee80211_rx_h_data(struct ieee80211_txr
 		memcpy(dst, hdr->addr1, ETH_ALEN);
 		memcpy(src, hdr->addr3, ETH_ALEN);
 
-		if (sdata->type != IEEE80211_IF_TYPE_STA) {
+		if (sdata->type != IEEE80211_IF_TYPE_STA ||
+		    (is_multicast_ether_addr(dst) &&
+		     !compare_ether_addr(src, dev->dev_addr)))
 			return TXRX_DROP;
-		}
 		break;
 	case 0:
 		/* DA SA BSSID */
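The drop condition added above can be illustrated with a small standalone sketch. The helper names below are hypothetical and only mirror the logic of the patch: a frame is treated as our own echoed multicast when the destination has the group bit set and the source equals our own address. Note that the kernel's compare_ether_addr() returns 0 on a match, which is why the patch negates it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 6	/* length of an Ethernet address in bytes */

/* Hypothetical helper: the low bit of the first octet marks group addresses. */
static bool is_multicast_addr(const uint8_t *addr)
{
	return (addr[0] & 0x01) != 0;
}

/* Hypothetical helper: true for a multicast frame that we sent ourselves. */
static bool is_own_multicast_echo(const uint8_t *dst, const uint8_t *src,
				  const uint8_t *own)
{
	return is_multicast_addr(dst) && memcmp(src, own, ADDR_LEN) == 0;
}

static void multicast_echo_demo(void)
{
	const uint8_t own[ADDR_LEN]   = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };
	const uint8_t peer[ADDR_LEN]  = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x66 };
	const uint8_t mcast[ADDR_LEN] = { 0x33, 0x33, 0x00, 0x00, 0x00, 0x01 };

	assert(is_own_multicast_echo(mcast, own, own));   /* our echo: drop */
	assert(!is_own_multicast_echo(mcast, peer, own)); /* someone else's */
	assert(!is_own_multicast_echo(own, own, own));    /* unicast */
}
```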
[patch 00/23] 2.6.23-stable review, network changes
This is the start of the stable review cycle for the 2.6.23.X release.
There are 23 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let us know.  If anyone is a maintainer of the proper subsystem, and
wants to add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the
Cc: line.  If you wish to be a reviewer, please email [EMAIL PROTECTED]
to add your name to the list.  If you want to be off the reviewer list,
also email us.

Responses should be made by Friday 00:00:00 UTC.  Anything received
after that time might be too late.

This set of patches focuses on only the core kernel.  Other sets of
patches will follow if you are interested in those instead.

The diffstat of this review series is included below.

thanks,

greg k-h

------------------

 include/linux/netlink.h                     |    2
 include/linux/skbuff.h                      |    3
 include/net/9p/9p.h                         |   12 ++
 ipc/mqueue.c                                |    6 -
 net/8021q/vlan.c                            |    5 +
 net/ieee80211/ieee80211_crypt_tkip.c        |    2
 net/ieee80211/softmac/ieee80211softmac_wx.c |    2
 net/ipv4/ipcomp.c                           |    3
 net/ipv6/ipcomp6.c                          |    3
 net/mac80211/ieee80211.c                    |   55 +++-
 net/mac80211/ieee80211_ioctl.c              |   11 ++
 net/mac80211/ieee80211_sta.c                |  128 +++-
 net/netfilter/nf_conntrack_proto_tcp.c      |   38 +++-
 net/netlink/af_netlink.c                    |   10 +-
 net/sched/cls_u32.c                         |    4
 net/sched/sch_api.c                         |    5 -
 net/sched/sch_teql.c                        |    3
 net/socket.c                                |    6 +
 18 files changed, 161 insertions(+), 137 deletions(-)
[patch 17/19] x86 setup: sizeof() is unsigned, unbreak comparisons
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: H. Peter Anvin <[EMAIL PROTECTED]>

patch e6e1ace9904b72478f0c5a5aa7bd174cb6f62561 in mainline.

We use signed values for limit checking since the values can go
negative under certain circumstances.  However, sizeof() is unsigned
and forces the comparison to be unsigned, so move the comparison into
the heap_free() macros so we can ensure it is a signed comparison.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/i386/boot/boot.h       |    4 ++--
 arch/i386/boot/video-bios.c |    2 +-
 arch/i386/boot/video-vesa.c |    2 +-
 arch/i386/boot/video.c      |    2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

--- a/arch/i386/boot/boot.h
+++ b/arch/i386/boot/boot.h
@@ -216,9 +216,9 @@ static inline char *__get_heap(size_t s,
 #define GET_HEAP(type, n) \
 	((type *)__get_heap(sizeof(type),__alignof__(type),(n)))
 
-static inline int heap_free(void)
+static inline bool heap_free(size_t n)
 {
-	return heap_end-HEAP;
+	return (int)(heap_end-HEAP) >= (int)n;
 }
 
 /* copy.S */
--- a/arch/i386/boot/video-bios.c
+++ b/arch/i386/boot/video-bios.c
@@ -79,7 +79,7 @@ static int bios_probe(void)
 	video_bios.modes = GET_HEAP(struct mode_info, 0);
 
 	for (mode = 0x14; mode <= 0x7f; mode++) {
-		if (heap_free() < sizeof(struct mode_info))
+		if (!heap_free(sizeof(struct mode_info)))
 			break;
 
 		if (mode_defined(VIDEO_FIRST_BIOS+mode))
--- a/arch/i386/boot/video-vesa.c
+++ b/arch/i386/boot/video-vesa.c
@@ -57,7 +57,7 @@ static int vesa_probe(void)
 	while ((mode = rdfs16(mode_ptr)) != 0xffff) {
 		mode_ptr += 2;
 
-		if (heap_free() < sizeof(struct mode_info))
+		if (!heap_free(sizeof(struct mode_info)))
 			break;	/* Heap full, can't save mode info */
 
 		if (mode & ~0x1ff)
--- a/arch/i386/boot/video.c
+++ b/arch/i386/boot/video.c
@@ -371,7 +371,7 @@ static void save_screen(void)
 	saved.curx = boot_params.screen_info.orig_x;
 	saved.cury = boot_params.screen_info.orig_y;
 
-	if (heap_free() < saved.x*saved.y*sizeof(u16)+512)
+	if (!heap_free(saved.x*saved.y*sizeof(u16)+512))
 		return;		/* Not enough heap to save the screen */
 
 	saved.data = GET_HEAP(u16, saved.x*saved.y);
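The pitfall described in this changelog is easy to reproduce outside the boot code. The sketch below uses hypothetical function names and spells out the conversion that C performs implicitly when an int is compared against a size_t; it is an illustration of the language rule, not the boot code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * What "free_bytes >= sizeof(x)" effectively does: the signed value is
 * converted to size_t first, so a negative amount becomes a huge one.
 * The cast is written out here to make the implicit conversion visible.
 */
static bool has_room_unsigned(int free_bytes, size_t need)
{
	return (size_t)free_bytes >= need;
}

/* The approach taken by the patch: force a signed comparison. */
static bool has_room_signed(int free_bytes, size_t need)
{
	return free_bytes >= (int)need;
}

static void signed_compare_demo(void)
{
	/* heap already overrun by 16 bytes, 64 bytes requested */
	assert(has_room_unsigned(-16, 64));	/* wrongly reports room */
	assert(!has_room_signed(-16, 64));	/* correctly reports full */

	/* both agree when the free amount is non-negative */
	assert(has_room_unsigned(128, 64) && has_room_signed(128, 64));
}
```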
[patch 18/19] x86: fix TSC clock source calibration error
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Dave Johnson <[EMAIL PROTECTED]>

patch edaf420fdc122e7a42326fe39274c8b8c9b19d41 in mainline.

I ran into this problem on a system that was unable to obtain NTP sync
because the clock was running very slow (over 10000ppm slow).  ntpd had
declared all of its peers 'reject' with 'peer_dist' reason.

On investigation, the tsc_khz variable was significantly incorrect
causing xtime to run slow.  After a reboot tsc_khz was correct so I
did a reboot test to see how often the problem occurred:

Test was done on a 2000 MHz Xeon system.  Of 689 reboots, 8 of them
had unacceptable tsc_khz values (>500ppm):

range of tsc_khz  # of boots  % of boots
----------------  ----------  ----------
        < 1999750      0        0.000%
1999750 - 1999800     21        3.048%
1999800 - 1999850    166       24.128%
1999850 - 1999900    241       35.029%
1999900 - 1999950    211       30.669%
1999950 - 2000000     42        6.105%
2000000 - 2000000      0        0.000%
2000050 - 2000100      0        0.000%
[...]
2000100 - 2015000      1        0.145%  << BAD
2015000 - 2030000      6        0.872%  << BAD
2030000 - 2045000      1        0.145%  << BAD
2045000 <              0        0.000%

The worst boot was 2032.577 MHz, over 1.5% off!

It appears that on rare occasions, mach_countup() is taking longer to
complete than necessary.

I suspect that this is caused by the CPU taking a periodic SMI
interrupt right at the end of the 30ms calibration loop.  This would
cause the loop to delay while the SMI BIOS handler runs.  The resulting
TSC value is beyond what it actually should be, resulting in a higher
tsc_khz.

The below patch makes native_calculate_cpu_khz() take the best
(shortest duration, lowest khz) run of its 3 calibration loops.  If an
SMI goes off causing a bad result (long duration, higher khz) it will
be discarded.

With the patch applied, 300 boots of the same system produce good
results:

range of tsc_khz  # of boots  % of boots
----------------  ----------  ----------
        < 1999750      0        0.000%
1999750 - 1999800     30       10.000%
1999800 - 1999850    166       55.333%
1999850 - 1999900     89       29.667%
1999900 - 1999950     15        5.000%
1999950 <              0        0.000%

Problem was found and tested against 2.6.18.  Patch is against 2.6.22.

Signed-off-by: Dave Johnson <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/i386/kernel/tsc.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/i386/kernel/tsc.c
+++ b/arch/i386/kernel/tsc.c
@@ -137,7 +137,7 @@ unsigned long native_calculate_cpu_khz(v
 {
 	unsigned long long start, end;
 	unsigned long count;
-	u64 delta64;
+	u64 delta64 = (u64)ULLONG_MAX;
 	int i;
 	unsigned long flags;
 
@@ -149,6 +149,7 @@ unsigned long native_calculate_cpu_khz(v
 		rdtscll(start);
 		mach_countup(&count);
 		rdtscll(end);
+		delta64 = min(delta64, (end - start));
 	}
 	/*
 	 * Error: ECTCNEVERSET
@@ -159,8 +160,6 @@ unsigned long native_calculate_cpu_khz(v
 	if (count <= 1)
 		goto err;
 
-	delta64 = end - start;
-
 	/* cpu freq too fast: */
 	if (delta64 > (1ULL<<32))
 		goto err;
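The effect of taking the shortest run can be shown with a few lines of arithmetic. This is only a sketch under stated assumptions: the cycle counts below are made-up sample values for a 2000 MHz part, the 30 ms window is taken from the description above, and shortest_run() and cycles_to_khz() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define CALIBRATE_MS 30	/* calibration window mentioned in the changelog */

/* Hypothetical helper: pick the shortest (least disturbed) measurement. */
static uint64_t shortest_run(const uint64_t *deltas, int n)
{
	uint64_t best = UINT64_MAX;

	for (int i = 0; i < n; i++)
		if (deltas[i] < best)
			best = deltas[i];
	return best;
}

/* Hypothetical helper: cycles per millisecond is the frequency in kHz. */
static uint64_t cycles_to_khz(uint64_t cycles)
{
	return cycles / CALIBRATE_MS;
}

static void tsc_calibration_demo(void)
{
	/* three runs in TSC cycles; the last one was stretched by a delay */
	const uint64_t deltas[3] = { 60000000, 60001500, 60975000 };

	/* using only the last run overestimates the clock by about 1.6% */
	assert(cycles_to_khz(deltas[2]) == 2032500);

	/* the minimum discards the disturbed run */
	assert(cycles_to_khz(shortest_run(deltas, 3)) == 2000000);
}
```

A delay can only lengthen a run, never shorten it, which is why the minimum is a reasonable estimator here.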
[patch 19/19] revert "x86_64: allocate sparsemem memmap above 4G"
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Linus Torvalds <[EMAIL PROTECTED]>

Reverted upstream by commit 6a22c57b8d2a62dea7280a6b2ac807a539ef0716

Revert this commit:

  commit 2e1c49db4c640b35df13889b86b9d62215ade4b6
  Author: Zou Nan hai <[EMAIL PROTECTED]>
  Date:   Fri Jun 1 00:46:28 2007 -0700

    x86_64: allocate sparsemem memmap above 4G

This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jones.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.

Reported-by: Dave Jones <[EMAIL PROTECTED]>
Tested-by: Martin Ebourne <[EMAIL PROTECTED]>
Cc: Zou Nan hai <[EMAIL PROTECTED]>
Cc: Suresh Siddha <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Acked-by: Andy Whitcroft <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Cc: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/x86_64/mm/init.c   |    6 ------
 include/linux/bootmem.h |    1 -
 mm/sparse.c             |   11 -----------
 3 files changed, 18 deletions(-)

--- a/arch/x86_64/mm/init.c
+++ b/arch/x86_64/mm/init.c
@@ -734,12 +734,6 @@ int in_gate_area_no_task(unsigned long a
 	return (addr >= VSYSCALL_START) && (addr < VSYSCALL_END);
 }
 
-void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
-{
-	return __alloc_bootmem_core(pgdat->bdata, size,
-			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
-}
-
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
 	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -59,7 +59,6 @@ extern void *__alloc_bootmem_core(struct
 				  unsigned long align,
 				  unsigned long goal,
 				  unsigned long limit);
-extern void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size);
 
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 extern void reserve_bootmem(unsigned long addr, unsigned long size);
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -215,12 +215,6 @@ static int __meminit sparse_init_one_sec
 	return 1;
 }
 
-__attribute__((weak)) __init
-void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
-{
-	return NULL;
-}
-
 static struct page __init *sparse_early_mem_map_alloc(unsigned long pnum)
 {
 	struct page *map;
@@ -231,11 +225,6 @@ static struct page __init *sparse_early_
 	if (map)
 		return map;
 
-	map = alloc_bootmem_high_node(NODE_DATA(nid),
-			sizeof(struct page) * PAGES_PER_SECTION);
-	if (map)
-		return map;
-
 	map = alloc_bootmem_node(NODE_DATA(nid),
 			sizeof(struct page) * PAGES_PER_SECTION);
 	if (map)
[patch 15/19] x86: fix global_flush_tlb() bug
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Ingo Molnar <[EMAIL PROTECTED]>

patch 9a24d04a3c26c223f22493492c5c9085b8773d4a upstream

While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():

	down_read(&init_mm.mmap_sem);
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem);

this is SMP-unsafe because list_replace_init() done on two CPUs in
parallel can corrupt the list.

This bug was introduced about a year ago in the 64-bit tree:

   commit ea7322decb974a4a3e804f96a0201e893ff88ce3
   Author: Andi Kleen <[EMAIL PROTECTED]>
   Date:   Thu Dec 7 02:14:05 2006 +0100

       [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                down_read(&init_mm.mmap_sem);
        -       dpage = xchg(&deferred_pages, NULL);
        +       list_replace_init(&deferred_pages, &l);
                up_read(&init_mm.mmap_sem);

the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.

why this bug never became prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.

the safe fix for now is to write-lock init_mm.mmap_sem.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/x86_64/mm/pageattr.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/arch/x86_64/mm/pageattr.c
+++ b/arch/x86_64/mm/pageattr.c
@@ -229,9 +229,14 @@ void global_flush_tlb(void)
 	struct page *pg, *next;
 	struct list_head l;
 
-	down_read(&init_mm.mmap_sem);
+	/*
+	 * Write-protect the semaphore, to exclude two contexts
+	 * doing a list_replace_init() call in parallel and to
+	 * exclude new additions to the deferred_pages list:
+	 */
+	down_write(&init_mm.mmap_sem);
 	list_replace_init(&deferred_pages, &l);
-	up_read(&init_mm.mmap_sem);
+	up_write(&init_mm.mmap_sem);
 
 	flush_map(&l);
 
[patch 16/19] x86 setup: handle boot loaders which set up the stack incorrectly
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: H. Peter Anvin <[EMAIL PROTECTED]>

patch 6b6815c6d5d1dc209701d1661a7a0e09a295db2f in mainline.

Apparently some specific versions of LILO enter the kernel with a
stack pointer that doesn't match the rest of the segments.  Make our
best attempt at untangling the resulting mess.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/i386/boot/boot.h   |    4 +--
 arch/i386/boot/header.S |   62 ++--
 2 files changed, 46 insertions(+), 20 deletions(-)

--- a/arch/i386/boot/boot.h
+++ b/arch/i386/boot/boot.h
@@ -17,6 +17,8 @@
 #ifndef BOOT_BOOT_H
 #define BOOT_BOOT_H
 
+#define STACK_SIZE	512	/* Minimum number of bytes for stack */
+
 #ifndef __ASSEMBLY__
 
 #include <stdarg.h>
@@ -198,8 +200,6 @@ static inline int isdigit(int ch)
 }
 
 /* Heap -- available for dynamic lists. */
-#define STACK_SIZE	512	/* Minimum number of bytes for stack */
-
 extern char _end[];
 extern char *HEAP;
 extern char *heap_end;
--- a/arch/i386/boot/header.S
+++ b/arch/i386/boot/header.S
@@ -173,7 +173,8 @@ ramdisk_size:	.long	0		# its size in byt
 bootsect_kludge:
 		.long	0		# obsolete
 
-heap_end_ptr:	.word	_end+1024	# (Header version 0x0201 or later)
+heap_end_ptr:	.word	_end+STACK_SIZE-512
+					# (Header version 0x0201 or later)
 					# space from here (exclusive) down to
 					# end of setup code can be used by setup
 					# for local heap purposes.
@@ -225,28 +226,53 @@ start_of_setup:
 	int	$0x13
 #endif
 
-# We will have entered with %cs = %ds+0x20, normalize %cs so
-# it is on par with the other segments.
-	pushw	%ds
-	pushw	$setup2
-	lretw
-
-setup2:
 # Force %es = %ds
 	movw	%ds, %ax
 	movw	%ax, %es
 	cld
 
-# Stack paranoia: align the stack and make sure it is good
-# for both 16- and 32-bit references.  In particular, if we
-# were meant to have been using the full 16-bit segment, the
-# caller might have set %sp to zero, which breaks %esp-based
-# references.
-	andw	$~3, %sp	# dword align (might as well...)
-	jnz	1f
-	movw	$0xfffc, %sp	# Make sure we're not zero
-1:	movzwl	%sp, %esp	# Clear upper half of %esp
-	sti
+# Apparently some ancient versions of LILO invoked the kernel
+# with %ss != %ds, which happened to work by accident for the
+# old code.  If the CAN_USE_HEAP flag is set in loadflags, or
+# %ss != %ds, then adjust the stack pointer.
+
+	# Smallest possible stack we can tolerate
+	movw	$(_end+STACK_SIZE), %cx
+
+	movw	heap_end_ptr, %dx
+	addw	$512, %dx
+	jnc	1f
+	xorw	%dx, %dx	# Wraparound - whole segment available
+1:	testb	$CAN_USE_HEAP, loadflags
+	jnz	2f
+
+	# No CAN_USE_HEAP
+	movw	%ss, %dx
+	cmpw	%ax, %dx	# %ds == %ss?
+	movw	%sp, %dx
+	# If so, assume %sp is reasonably set, otherwise use
+	# the smallest possible stack.
+	jne	4f		# -> Smallest possible stack...
+
+	# Make sure the stack is at least minimum size.  Take a value
+	# of zero to mean "full segment."
+2:
+	andw	$~3, %dx	# dword align (might as well...)
+	jnz	3f
+	movw	$0xfffc, %dx	# Make sure we're not zero
+3:	cmpw	%cx, %dx
+	jnb	5f
+4:	movw	%cx, %dx	# Minimum value we can possibly use
+5:	movw	%ax, %ss
+	movzwl	%dx, %esp	# Clear upper half of %esp
+	sti			# Now we should have a working stack
+
+# We will have entered with %cs = %ds+0x20, normalize %cs so
+# it is on par with the other segments.
+	pushw	%ds
+	pushw	$6f
+	lretw
+6:
 
 # Check signature at end of setup
 	cmpl	$0x5a5aaa55, setup_sig
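The stack selection done by the new assembly is compact, so a C rendering of the same decision may help reviewers follow the branches. This is a hand-written sketch, not code from the kernel: choose_stack_pointer() and its parameters are hypothetical names, min_sp stands for _end+STACK_SIZE, and the booleans stand for the CAN_USE_HEAP test and the %ss == %ds comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C rendering of the stack pointer choice in the patch. */
static uint16_t choose_stack_pointer(bool can_use_heap, bool ss_equals_ds,
				     uint16_t sp, uint16_t heap_end_ptr,
				     uint16_t min_sp)
{
	uint32_t dx;

	if (can_use_heap) {
		dx = (uint32_t)heap_end_ptr + 512;
		if (dx > 0xffff)	/* carry: whole segment available */
			dx = 0;
	} else if (ss_equals_ds) {
		dx = sp;		/* assume %sp is reasonably set */
	} else {
		return min_sp;		/* smallest possible stack */
	}

	dx &= 0xfffc;			/* dword align */
	if (dx == 0)
		dx = 0xfffc;		/* zero means "full segment" */
	if (dx < min_sp)
		dx = min_sp;		/* enforce the minimum */
	return (uint16_t)dx;
}

static void choose_stack_pointer_demo(void)
{
	/* heap end wraps around: treat the whole segment as usable */
	assert(choose_stack_pointer(true, true, 0, 0xfe00, 0x3000) == 0xfffc);
	/* heap end below the minimum: fall back to the minimum */
	assert(choose_stack_pointer(true, false, 0x1234, 0x1000, 0x3000) == 0x3000);
	/* no heap flag, matching segments: keep %sp, aligned down */
	assert(choose_stack_pointer(false, true, 0x7ffe, 0, 0x3000) == 0x7ffc);
	/* no heap flag, mismatched segments: smallest possible stack */
	assert(choose_stack_pointer(false, false, 0x7ffe, 0, 0x3000) == 0x3000);
}
```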
[patch 13/19] xen: fix incorrect vcpu_register_vcpu_info hypercall argument
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

patch e3d2697669abbe26c08dc9b95e2a71c634d096ed in mainline.

The kernel's copy of struct vcpu_register_vcpu_info was out of date,
at best causing the hypercall to fail and the guest kernel to fall
back to the old mechanism, or worse, causing random memory corruption.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Stable Kernel <[EMAIL PROTECTED]>
Cc: Morten Bøgeskov <[EMAIL PROTECTED]>
Cc: Mark Williamson <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/i386/xen/enlighten.c    |    2 +-
 include/xen/interface/vcpu.h |    5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

--- a/arch/i386/xen/enlighten.c
+++ b/arch/i386/xen/enlighten.c
@@ -116,7 +116,7 @@ static void __init xen_vcpu_setup(int cp
 	info.mfn = virt_to_mfn(vcpup);
 	info.offset = offset_in_page(vcpup);
 
-	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %x, offset %d\n",
+	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %llx, offset %d\n",
 	       cpu, vcpup, info.mfn, info.offset);
 
 	/* Check to see if the hypervisor will put the vcpu_info
--- a/include/xen/interface/vcpu.h
+++ b/include/xen/interface/vcpu.h
@@ -160,8 +160,9 @@ struct vcpu_set_singleshot_timer {
  */
 #define VCPUOP_register_vcpu_info   10  /* arg == struct vcpu_info */
 struct vcpu_register_vcpu_info {
-    uint32_t mfn;               /* mfn of page to place vcpu_info */
-    uint32_t offset;            /* offset within page */
+    uint64_t mfn;    /* mfn of page to place vcpu_info */
+    uint32_t offset; /* offset within page */
+    uint32_t rsvd;   /* unused */
 };
 
 #endif /* __XEN_PUBLIC_VCPU_H__ */
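To see why a stale copy of the structure is harmful, it helps to compare the two layouts directly. The sketch below copies the field lists from the diff into two hypothetical struct names (reg_info_old and reg_info_new) and checks their sizes and offsets; it says nothing about how the hypervisor actually consumes the argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field list before the patch (hypothetical struct name). */
struct reg_info_old {
	uint32_t mfn;
	uint32_t offset;
};

/* Field list after the patch (hypothetical struct name). */
struct reg_info_new {
	uint64_t mfn;
	uint32_t offset;
	uint32_t rsvd;
};

static void layout_demo(void)
{
	/* the old structure is half the size of the new one */
	assert(sizeof(struct reg_info_old) == 8);
	assert(sizeof(struct reg_info_new) == 16);

	/* "offset" moved from byte 4 to byte 8 */
	assert(offsetof(struct reg_info_old, offset) == 4);
	assert(offsetof(struct reg_info_new, offset) == 8);
}
```

A consumer expecting the new layout reads 16 bytes and looks for offset at byte 8, which lies beyond the end of an 8-byte old-style structure, so it would pick up whatever happens to follow it in memory.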
[patch 14/19] xfs: eagerly remove vmap mappings to avoid upsetting Xen
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

patch ace2e92e193126711cb3a83a3752b2c5b8396950 in mainline.

XFS leaves stray mappings around when it vmaps memory to make it
virtually contiguous.  This upsets Xen if one of those pages is being
recycled into a pagetable, since it finds an extra writable mapping of
the page.

This patch solves the problem in a brute force way, by making XFS
always eagerly unmap its mappings.

[ Stable: This works around a bug in 2.6.23.  We may come up with a
  better solution for mainline, but this seems like a low-impact fix
  for the stable kernel. ]

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: XFS masters <[EMAIL PROTECTED]>
Cc: Morten Bøgeskov <[EMAIL PROTECTED]>
Cc: Mark Williamson <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 fs/xfs/linux-2.6/xfs_buf.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/fs/xfs/linux-2.6/xfs_buf.c
+++ b/fs/xfs/linux-2.6/xfs_buf.c
@@ -187,6 +187,19 @@ free_address(
 {
 	a_list_t	*aentry;
 
+#ifdef CONFIG_XEN
+	/*
+	 * Xen needs to be able to make sure it can get an exclusive
+	 * RO mapping of pages it wants to turn into a pagetable.  If
+	 * a newly allocated page is also still being vmap()ed by xfs,
+	 * it will cause pagetable construction to fail.  This is a
+	 * quick workaround to always eagerly unmap pages so that Xen
+	 * is happy.
+	 */
+	vunmap(addr);
+	return;
+#endif
+
 	aentry = kmalloc(sizeof(a_list_t), GFP_NOWAIT);
 	if (likely(aentry)) {
 		spin_lock(&as_lock);
[patch 10/19] UML - kill subprocesses on exit
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Lepton Wu <[EMAIL PROTECTED]>

commit a24864a1d52a97e345a6bd4862a057f98364d098

uml: definitively kill subprocesses on panic

On a stock 2.6.22.6 kernel, powering off a user mode linux guest
(2.6.22.6 running in skas0 mode) halts the host linux.  I think the
reason is that the kernel thread aborts because of a bug.  The
sys_reboot in the guest process is then not trapped by the user mode
linux kernel and is executed by the host.  I think it is better to make
sure all of our child processes quit when the user mode linux kernel
aborts.

[ jdike - the kernel process needs to ignore SIGTERM, plus the waitpid/kill
loop is needed to make sure that all of our children are dead before the
kernel exits ]

Signed-off-by: Lepton Wu <[EMAIL PROTECTED]>
Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/um/os-Linux/skas/process.c |    2 +-
 arch/um/os-Linux/util.c         |   38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 1 deletion(-)

--- a/arch/um/os-Linux/skas/process.c
+++ b/arch/um/os-Linux/skas/process.c
@@ -182,7 +182,7 @@ static int userspace_tramp(void *stack)
 
 	ptrace(PTRACE_TRACEME, 0, 0, 0);
 
-	init_new_thread_signals();
+	signal(SIGTERM, SIG_DFL);
 	err = set_interval(1);
 	if(err)
 		panic("userspace_tramp - setting timer failed, errno = %d\n",
--- a/arch/um/os-Linux/util.c
+++ b/arch/um/os-Linux/util.c
@@ -105,6 +105,44 @@ int setjmp_wrapper(void (*proc)(void *,
 
 void os_dump_core(void)
 {
+	int pid;
+
 	signal(SIGSEGV, SIG_DFL);
+
+	/*
+	 * We are about to SIGTERM this entire process group to ensure that
+	 * nothing is around to run after the kernel exits.  The
+	 * kernel wants to abort, not die through SIGTERM, so we
+	 * ignore it here.
+	 */
+
+	signal(SIGTERM, SIG_IGN);
+	kill(0, SIGTERM);
+	/*
+	 * Most of the other processes associated with this UML are
+	 * likely sTopped, so give them a SIGCONT so they see the
+	 * SIGTERM.
+	 */
+	kill(0, SIGCONT);
+
+	/*
+	 * Now, having sent signals to everyone but us, make sure they
+	 * die by ptrace.  Processes can survive what's been done to
+	 * them so far - the mechanism I understand is receiving a
+	 * SIGSEGV and segfaulting immediately upon return.  There is
+	 * always a SIGSEGV pending, and (I'm guessing) signals are
+	 * processed in numeric order so the SIGTERM (signal 15 vs
+	 * SIGSEGV being signal 11) is never handled.
+	 *
+	 * Run a waitpid loop until we get some kind of error.
+	 * Hopefully, it's ECHILD, but there's not a lot we can do if
+	 * it's something else.  Tell os_kill_ptraced_process not to
+	 * wait for the child to report its death because there's
+	 * nothing reasonable to do if that fails.
+	 */
+
+	while ((pid = waitpid(-1, NULL, WNOHANG)) > 0)
+		os_kill_ptraced_process(pid, 0);
+
 	abort();
 }
[patch 11/19] xen: add batch completion callbacks
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

patch 91e0c5f3dad47838cb2ecc1865ce789a0b7182b1 in mainline.

This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued.  This is typically used to unlock
things which must remain locked until the hypercall has taken place.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/i386/xen/multicalls.c |   29 ++++++++++++++++++++++++++---
 arch/i386/xen/multicalls.h |    3 +++
 2 files changed, 29 insertions(+), 3 deletions(-)

--- a/arch/i386/xen/multicalls.c
+++ b/arch/i386/xen/multicalls.c
@@ -32,7 +32,11 @@
 struct mc_buffer {
 	struct multicall_entry entries[MC_BATCH];
 	u64 args[MC_ARGS];
-	unsigned mcidx, argidx;
+	struct callback {
+		void (*fn)(void *);
+		void *data;
+	} callbacks[MC_BATCH];
+	unsigned mcidx, argidx, cbidx;
 };
 
 static DEFINE_PER_CPU(struct mc_buffer, mc_buffer);
@@ -43,6 +47,7 @@ void xen_mc_flush(void)
 	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
 	int ret = 0;
 	unsigned long flags;
+	int i;
 
 	BUG_ON(preemptible());
 
@@ -51,8 +56,6 @@ void xen_mc_flush(void)
 	local_irq_save(flags);
 
 	if (b->mcidx) {
-		int i;
-
 		if (HYPERVISOR_multicall(b->entries, b->mcidx) != 0)
 			BUG();
 		for (i = 0; i < b->mcidx; i++)
@@ -65,6 +68,13 @@ void xen_mc_flush(void)
 
 	local_irq_restore(flags);
 
+	for(i = 0; i < b->cbidx; i++) {
+		struct callback *cb = &b->callbacks[i];
+
+		(*cb->fn)(cb->data);
+	}
+	b->cbidx = 0;
+
 	BUG_ON(ret);
 }
 
@@ -88,3 +98,16 @@ struct multicall_space __xen_mc_entry(si
 
 	return ret;
 }
+
+void xen_mc_callback(void (*fn)(void *), void *data)
+{
+	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct callback *cb;
+
+	if (b->cbidx == MC_BATCH)
+		xen_mc_flush();
+
+	cb = &b->callbacks[b->cbidx++];
+	cb->fn = fn;
+	cb->data = data;
+}
--- a/arch/i386/xen/multicalls.h
+++ b/arch/i386/xen/multicalls.h
@@ -42,4 +42,7 @@ static inline void xen_mc_issue(unsigned
 	local_irq_restore(x86_read_percpu(xen_mc_irq_flags));
 }
 
+/* Set up a callback to be called when the current batch is flushed */
+void xen_mc_callback(void (*fn)(void *), void *data);
+
 #endif /* _XEN_MULTICALLS_H */
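The pattern introduced here, queueing callbacks that run only once the batch has really been issued, can be tried in plain userspace C. The sketch below is a simplified stand-in with hypothetical names (struct batch, batch_flush(), batch_callback()); it has no per-cpu state, no interrupt handling and no hypercalls, and only mirrors the register/flush ordering of the patch.

```c
#include <assert.h>

#define BATCH_MAX 4	/* hypothetical batch capacity */

struct batch {
	struct {
		void (*fn)(void *);
		void *data;
	} cbs[BATCH_MAX];
	unsigned n_ops, n_cbs;
};

/* Issue the queued work, then run and clear the registered callbacks. */
static void batch_flush(struct batch *b)
{
	b->n_ops = 0;	/* a real implementation would issue the ops here */

	for (unsigned i = 0; i < b->n_cbs; i++)
		b->cbs[i].fn(b->cbs[i].data);
	b->n_cbs = 0;
}

/* Register a callback; flush first if the callback table is full. */
static void batch_callback(struct batch *b, void (*fn)(void *), void *data)
{
	if (b->n_cbs == BATCH_MAX)
		batch_flush(b);

	b->cbs[b->n_cbs].fn = fn;
	b->cbs[b->n_cbs].data = data;
	b->n_cbs++;
}

static void release_lock(void *data)
{
	*(int *)data = 0;	/* stands in for unlocking something */
}

static void batch_demo(void)
{
	struct batch b = { .n_ops = 0, .n_cbs = 0 };
	int locked = 1;

	b.n_ops++;				/* queue one operation */
	batch_callback(&b, release_lock, &locked);
	assert(locked == 1);			/* still held while batched */

	batch_flush(&b);
	assert(locked == 0);			/* released after the flush */
	assert(b.n_cbs == 0);
}
```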
[patch 12/19] xen: deal with stale cr3 values when unpinning pagetables
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

patch 9f79991d4186089e228274196413572cc000143b in mainline.

When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed. However, this is only possible if there are no
stray uses of the pagetable. The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3. It is only updated once
the actual hypercall to set cr3 has been completed. Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/i386/xen/enlighten.c |   55 +++---
 arch/i386/xen/mmu.c       |   29 +---
 arch/i386/xen/xen-ops.h   |    1
 3 files changed, 65 insertions(+), 20 deletions(-)

--- a/arch/i386/xen/enlighten.c
+++ b/arch/i386/xen/enlighten.c
@@ -56,7 +56,23 @@ DEFINE_PER_CPU(enum paravirt_lazy_mode,
 
 DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
-DEFINE_PER_CPU(unsigned long, xen_cr3);
+
+/*
+ * Note about cr3 (pagetable base) values:
+ *
+ * xen_cr3 contains the current logical cr3 value; it contains the
+ * last set cr3. This may not be the current effective cr3, because
+ * its update may be being lazily deferred. However, a vcpu looking
+ * at its own cr3 can use this value knowing that it everything will
+ * be self-consistent.
+ *
+ * xen_current_cr3 contains the actual vcpu cr3; it is set once the
+ * hypercall to set the vcpu cr3 is complete (so it may be a little
+ * out of date, but it will never be set early). If one vcpu is
+ * looking at another vcpu's cr3 value, it should use this variable.
+ */
+DEFINE_PER_CPU(unsigned long, xen_cr3);	 /* cr3 stored as physaddr */
+DEFINE_PER_CPU(unsigned long, xen_current_cr3);	 /* actual vcpu cr3 */
 
 struct start_info *xen_start_info;
 EXPORT_SYMBOL_GPL(xen_start_info);
@@ -632,32 +648,36 @@ static unsigned long xen_read_cr3(void)
 	return x86_read_percpu(xen_cr3);
 }
 
+static void set_current_cr3(void *v)
+{
+	x86_write_percpu(xen_current_cr3, (unsigned long)v);
+}
+
 static void xen_write_cr3(unsigned long cr3)
 {
+	struct mmuext_op *op;
+	struct multicall_space mcs;
+	unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
+
 	BUG_ON(preemptible());
 
-	if (cr3 == x86_read_percpu(xen_cr3)) {
-		/* just a simple tlb flush */
-		xen_flush_tlb();
-		return;
-	}
+	mcs = xen_mc_entry(sizeof(*op));  /* disables interrupts */
 
+	/* Update while interrupts are disabled, so its atomic with
+	   respect to ipis */
 	x86_write_percpu(xen_cr3, cr3);
 
+	op = mcs.args;
+	op->cmd = MMUEXT_NEW_BASEPTR;
+	op->arg1.mfn = mfn;
 
-	{
-		struct mmuext_op *op;
-		struct multicall_space mcs = xen_mc_entry(sizeof(*op));
-		unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
-
-		op = mcs.args;
-		op->cmd = MMUEXT_NEW_BASEPTR;
-		op->arg1.mfn = mfn;
+	MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
 
-		MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
+	/* Update xen_update_cr3 once the batch has actually
+	   been submitted. */
+	xen_mc_callback(set_current_cr3, (void *)cr3);
 
-		xen_mc_issue(PARAVIRT_LAZY_CPU);
-	}
+	xen_mc_issue(PARAVIRT_LAZY_CPU);  /* interrupts restored */
 }
 
 /* Early in boot, while setting up the initial pagetable, assume
@@ -1113,6 +1133,7 @@ asmlinkage void __init xen_start_kernel(
 	/* keep using Xen gdt for now; no urgent need to change it */
 
 	x86_write_percpu(xen_cr3, __pa(pgd));
+	x86_write_percpu(xen_current_cr3, __pa(pgd));
 
 #ifdef CONFIG_SMP
 	/* Don't do the full vcpu_info placement stuff until we have a
--- a/arch/i386/xen/mmu.c
+++ b/arch/i386/xen/mmu.c
@@ -515,20 +515,43 @@ static void drop_other_mm_ref(void *info
 
 	if (__get_cpu_var(cpu_tlbstate).active_mm == mm)
 		leave_mm(smp_processor_id());
+
+	/* If this cpu still has a stale cr3 reference, then make sure
+	   it has been flushed. */
+	if (x86_read_percpu(xen_current_cr3) == __pa(mm->pgd)) {
+		load_cr3(swapper_pg_dir);
+		arch_flush_lazy_cpu_mode();
+	}
 }
 
 static void drop_mm_ref(struct mm_struct *mm)
 {
+	cpumask_t mask;
+
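The bookkeeping described in the changelog above can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct vcpu_sketch` and the `sketch_*` helpers are hypothetical names, not Xen or kernel APIs, and the lazy multicall batch is reduced to a single `pending` flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two per-cpu variables in the patch. */
struct vcpu_sketch {
	unsigned long logical_cr3;	/* like xen_cr3: last value requested */
	unsigned long current_cr3;	/* like xen_current_cr3: value really loaded */
	bool pending;			/* a batched switch has not been issued yet */
};

/* Request a switch; the logical value changes at once, the real one later. */
static void sketch_write_cr3(struct vcpu_sketch *v, unsigned long cr3)
{
	v->logical_cr3 = cr3;
	v->pending = true;
}

/* Issue the batch: only now does the "actual" value follow. */
static void sketch_issue(struct vcpu_sketch *v)
{
	if (v->pending) {
		v->current_cr3 = v->logical_cr3;
		v->pending = false;
	}
}

/* Would unpinning pgd be unsafe because some vcpu still really uses it? */
static bool sketch_in_use(const struct vcpu_sketch *vcpus, size_t n,
			  unsigned long pgd)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (vcpus[i].current_cr3 == pgd)
			return true;
	return false;
}
```

The point of the second variable is visible in `sketch_in_use()`: a remote observer that looked at `logical_cr3` would conclude too early that the old pagetable is free.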
[patch 08/19] UML - Fix kernel vs libc symbols clash
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Jeff Dike <[EMAIL PROTECTED]>

commit 818f6ef407b448cef63294b9d0f6f8a2af9cb817 in mainline.

uml: fix an IPV6 libc vs kernel symbol clash

On some systems, with IPV6 configured, there is a clash between the
kernel's in6addr_any and the one in libc.

This is handled in the usual (gross) way of defining the kernel symbol
out of the way on the gcc command line.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/um/Makefile |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -60,7 +60,8 @@ SYS_DIR := $(ARCH_DIR)/include/sysdep-$
 
 CFLAGS += $(CFLAGS-y) -D__arch_um__ -DSUBARCH=\"$(SUBARCH)\" \
 	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap \
-	-Din6addr_loopback=kernel_in6addr_loopback
+	-Din6addr_loopback=kernel_in6addr_loopback \
+	-Din6addr_any=kernel_in6addr_any
 
 AFLAGS += $(ARCH_INCLUDE)
 
[patch 09/19] UML - stop using libc asm/user.h
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Jeff Dike <[EMAIL PROTECTED]>

commit 189872f968def833727b6bfef83ebd7440c538e6 in mainline.

uml: don't use glibc asm/user.h

Stop including asm/user.h from libc - it seems to be disappearing from
distros. It's replaced with sys/user.h which defines user_fpregs_struct
and user_fpxregs_struct instead of user_i387_struct and struct
user_fxsr_struct on i386.

As a bonus, on x86_64, I get to dump some stupid typedefs which were
needed in order to get asm/user.h to compile.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/um/sys-i386/user-offsets.c   |    6 +++---
 arch/um/sys-x86_64/user-offsets.c |    9 +
 2 files changed, 4 insertions(+), 11 deletions(-)

--- a/arch/um/sys-i386/user-offsets.c
+++ b/arch/um/sys-i386/user-offsets.c
@@ -2,9 +2,9 @@
 #include
 #include
 #include
+#include <sys/user.h>
 #include
 #include
-#include <asm/user.h>
 
 #define DEFINE(sym, val) \
 	asm volatile("\n->" #sym " %0 " #val : : "i" (val))
@@ -48,8 +48,8 @@ void foo(void)
 	OFFSET(HOST_SC_FP_ST, _fpstate, _st);
 	OFFSET(HOST_SC_FXSR_ENV, _fpstate, _fxsr_env);
 
-	DEFINE_LONGS(HOST_FP_SIZE, sizeof(struct user_i387_struct));
-	DEFINE_LONGS(HOST_XFP_SIZE, sizeof(struct user_fxsr_struct));
+	DEFINE_LONGS(HOST_FP_SIZE, sizeof(struct user_fpregs_struct));
+	DEFINE_LONGS(HOST_XFP_SIZE, sizeof(struct user_fpxregs_struct));
 
 	DEFINE(HOST_IP, EIP);
 	DEFINE(HOST_SP, UESP);
--- a/arch/um/sys-x86_64/user-offsets.c
+++ b/arch/um/sys-x86_64/user-offsets.c
@@ -3,17 +3,10 @@
 #include
 #include
 #include
+#include <sys/user.h>
 #define __FRAME_OFFSETS
 #include
 #include
-/* For some reason, x86_64 defines u64 and u32 only in , which I
- * refuse to include here, even though they're used throughout the headers.
- * These are used in asm/user.h, and that include can't be avoided because of
- * the sizeof(struct user_regs_struct) below.
- */
-typedef __u64 u64;
-typedef __u32 u32;
-#include <asm/user.h>
 
 #define DEFINE(sym, val) \
 	asm volatile("\n->" #sym " %0 " #val : : "i" (val))
[patch 06/19] POWERPC: Make sure to of_node_get() the result of pci_device_to_OF_node()
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Michael Ellerman <[EMAIL PROTECTED]>

patch db220b234da9f183b127b9c3077c253b94756e35 in mainline.

pci_device_to_OF_node() returns the device node attached to a PCI
device, but doesn't actually grab a reference - we need to do it
ourselves.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
Acked-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/powerpc/platforms/cell/axon_msi.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/powerpc/platforms/cell/axon_msi.c
+++ b/arch/powerpc/platforms/cell/axon_msi.c
@@ -126,7 +126,7 @@ static struct axon_msic *find_msi_transl
 	const phandle *ph;
 	struct axon_msic *msic = NULL;
 
-	dn = pci_device_to_OF_node(dev);
+	dn = of_node_get(pci_device_to_OF_node(dev));
 	if (!dn) {
 		dev_dbg(&dev->dev, "axon_msi: no pci_dn found\n");
 		return NULL;
@@ -183,7 +183,7 @@ static int setup_msi_msg_address(struct
 	int len;
 	const u32 *prop;
 
-	dn = pci_device_to_OF_node(dev);
+	dn = of_node_get(pci_device_to_OF_node(dev));
 	if (!dn) {
 		dev_dbg(&dev->dev, "axon_msi: no pci_dn found\n");
 		return -ENODEV;
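Why a lookup that "doesn't actually grab a reference" matters can be shown with a toy reference count. This is a sketch, not the OF API: `struct node_sketch` and the `sketch_*` functions are hypothetical, and it assumes (this is not visible in the hunks above) that the caller drops one reference when it is done with the node.

```c
#include <stddef.h>

/* Hypothetical node with a bare reference count. */
struct node_sketch {
	int refcount;
};

/* Like pci_device_to_OF_node(): returns the pointer, takes no reference. */
static struct node_sketch *sketch_lookup(struct node_sketch *attached)
{
	return attached;
}

/* Like of_node_get(): takes a reference, tolerates NULL, returns its argument. */
static struct node_sketch *sketch_get(struct node_sketch *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Like of_node_put(): drops a reference. */
static void sketch_put(struct node_sketch *n)
{
	if (n)
		n->refcount--;
}

/*
 * A caller that always drops one reference at the end (assumption).
 * Returns the count left on the node afterwards, or -1 if there is no node.
 */
static int sketch_caller(struct node_sketch *attached, int take_ref)
{
	struct node_sketch *dn = sketch_lookup(attached);

	if (take_ref)
		dn = sketch_get(dn);
	if (!dn)
		return -1;

	/* ... use dn ... */

	sketch_put(dn);
	return attached->refcount;
}
```

Without the get, the final put consumes the reference held by the node's owner; with it, the count is balanced. Because the get passes NULL through, the existing `if (!dn)` check still works after wrapping the lookup.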
[patch 07/19] UML - Stop using libc asm/page.h
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Jeff Dike <[EMAIL PROTECTED]>

commit 71f926f2ea61994470a53c9e11d3ef993197cada in mainline.

uml: stop using libc asm/page.h

Remove includes of asm/page.h from libc code. This header seems to be
disappearing, and UML doesn't make much use of it anyway.

The one use, PAGE_SHIFT in stub.h, is handled by copying the constant
from the kernel side of the house in common_offsets.h.

[ jdike - added arch/um/kernel/skas/clone.c for -stable ]

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/um/include/common-offsets.h   |    1 +
 arch/um/include/sysdep-i386/stub.h |    3 +--
 arch/um/kernel/skas/clone.c        |    1 -
 arch/um/os-Linux/main.c            |    1 -
 arch/um/os-Linux/skas/mem.c        |    1 -
 arch/um/os-Linux/start_up.c        |    1 -
 arch/um/os-Linux/tt.c              |    1 -
 7 files changed, 2 insertions(+), 7 deletions(-)

--- a/arch/um/include/common-offsets.h
+++ b/arch/um/include/common-offsets.h
@@ -10,6 +10,7 @@ OFFSET(HOST_TASK_PID, task_struct, pid);
 
 DEFINE(UM_KERN_PAGE_SIZE, PAGE_SIZE);
 DEFINE(UM_KERN_PAGE_MASK, PAGE_MASK);
+DEFINE(UM_KERN_PAGE_SHIFT, PAGE_SHIFT);
 DEFINE(UM_NSEC_PER_SEC, NSEC_PER_SEC);
 
 DEFINE_STR(UM_KERN_EMERG, KERN_EMERG);
--- a/arch/um/include/sysdep-i386/stub.h
+++ b/arch/um/include/sysdep-i386/stub.h
@@ -9,7 +9,6 @@
 #include
 #include
 #include
-#include <asm/page.h>
 #include "stub-data.h"
 #include "kern_constants.h"
 #include "uml-config.h"
@@ -19,7 +18,7 @@ extern void stub_clone_handler(void);
 #define STUB_SYSCALL_RET EAX
 #define STUB_MMAP_NR __NR_mmap2
 
-#define MMAP_OFFSET(o) ((o) >> PAGE_SHIFT)
+#define MMAP_OFFSET(o) ((o) >> UM_KERN_PAGE_SHIFT)
 
 static inline long stub_syscall0(long syscall)
 {
--- a/arch/um/kernel/skas/clone.c
+++ b/arch/um/kernel/skas/clone.c
@@ -3,7 +3,6 @@
 #include
 #include
 #include
-#include <asm/page.h>
 #include "ptrace_user.h"
 #include "skas.h"
 #include "stub-data.h"
--- a/arch/um/os-Linux/main.c
+++ b/arch/um/os-Linux/main.c
@@ -12,7 +12,6 @@
 #include
 #include
 #include
-#include <asm/page.h>
 #include "kern_util.h"
 #include "as-layout.h"
 #include "mem_user.h"
--- a/arch/um/os-Linux/skas/mem.c
+++ b/arch/um/os-Linux/skas/mem.c
@@ -9,7 +9,6 @@
 #include
 #include
 #include
-#include <asm/page.h>
 #include
 #include "mem_user.h"
 #include "mem.h"
--- a/arch/um/os-Linux/start_up.c
+++ b/arch/um/os-Linux/start_up.c
@@ -19,7 +19,6 @@
 #include
 #include
 #include
-#include <asm/page.h>
 #include
 #include "kern_util.h"
 #include "user.h"
--- a/arch/um/os-Linux/tt.c
+++ b/arch/um/os-Linux/tt.c
@@ -17,7 +17,6 @@
 #include
 #include
 #include
-#include <asm/page.h>
 #include "kern_util.h"
 #include "user.h"
 #include "signal_kern.h"
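What MMAP_OFFSET() computes can be tried out in isolation. In the tree the constant comes from the generated constants header; the sketch below hard-codes a shift of 12 (4 KiB pages), which is an assumption made only for this example, and `sketch_page_offset()` is a hypothetical wrapper.

```c
/* Assumption for this sketch only: 4 KiB pages, so the shift is 12. */
#define UM_KERN_PAGE_SHIFT 12

/* Same shape as the macro in stub.h after the patch. */
#define MMAP_OFFSET(o) ((o) >> UM_KERN_PAGE_SHIFT)

/* Convert a byte offset into the page-unit offset passed to mmap2. */
static unsigned long sketch_page_offset(unsigned long byte_offset)
{
	return MMAP_OFFSET(byte_offset);
}
```

The stub uses __NR_mmap2, whose offset argument is counted in 4096-byte units rather than bytes, which is why a shift is all the macro needs.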
[patch 04/19] MIPS: R1: Fix hazard barriers to make kernels work on R2 also.
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Ralf Baechle <[EMAIL PROTECTED]>

patch 572afc248c33c902760f6f24a72c180f0e4f1719 in mainline.

Tested with Malta; inflates malta_defconfig by 3932 bytes. Ideally there
should be additional configuration to allow getting rid of this overhead
but that would be too much complexity at this stage of the release cycle.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 include/asm-mips/hazards.h |   54 -
 1 file changed, 53 insertions(+), 1 deletion(-)

--- a/include/asm-mips/hazards.h
+++ b/include/asm-mips/hazards.h
@@ -10,11 +10,12 @@
 #ifndef _ASM_HAZARDS_H
 #define _ASM_HAZARDS_H
 
-
 #ifdef __ASSEMBLY__
 #define ASMMACRO(name, code...) .macro name; code; .endm
 #else
 
+#include
+
 #define ASMMACRO(name, code...) \
 __asm__(".macro " #name "; " #code "; .endm"); \
 \
@@ -86,6 +87,57 @@ do { \
 	: "=r" (tmp)); \
 } while (0)
 
+#elif defined(CONFIG_CPU_MIPSR1)
+
+/*
+ * These are slightly complicated by the fact that we guarantee R1 kernels to
+ * run fine on R2 processors.
+ */
+ASMMACRO(mtc0_tlbw_hazard,
+	_ssnop; _ssnop; _ehb
+	)
+ASMMACRO(tlbw_use_hazard,
+	_ssnop; _ssnop; _ssnop; _ehb
+	)
+ASMMACRO(tlb_probe_hazard,
+	 _ssnop; _ssnop; _ssnop; _ehb
+	)
+ASMMACRO(irq_enable_hazard,
+	 _ssnop; _ssnop; _ssnop; _ehb
+	)
+ASMMACRO(irq_disable_hazard,
+	_ssnop; _ssnop; _ssnop; _ehb
+	)
+ASMMACRO(back_to_back_c0_hazard,
+	 _ssnop; _ssnop; _ssnop; _ehb
+	)
+/*
+ * gcc has a tradition of misscompiling the previous construct using the
+ * address of a label as argument to inline assembler. Gas otoh has the
+ * annoying difference between la and dla which are only usable for 32-bit
+ * rsp. 64-bit code, so can't be used without conditional compilation.
+ * The alterantive is switching the assembler to 64-bit code which happens
+ * to work right even for 32-bit code ...
+ */
+#define __instruction_hazard() \
+do { \
+	unsigned long tmp; \
+ \
+	__asm__ __volatile__( \
+	"	.set	mips64r2	\n" \
+	"	dla	%0, 1f		\n" \
+	"	jr.hb	%0		\n" \
+	"	.set	mips0		\n" \
+	"1:				\n" \
+	: "=r" (tmp)); \
+} while (0)
+
+#define instruction_hazard() \
+do { \
+	if (cpu_has_mips_r2) \
+		__instruction_hazard(); \
+} while (0)
+
 #elif defined(CONFIG_CPU_R1)
 
 /*
[patch 05/19] POWERPC: Fix handling of stfiwx math emulation
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Kumar Gala <[EMAIL PROTECTED]>

patch ba02946a903015840ef672ccc9dc8620a7e83de6 in mainline

It's legal for the stfiwx instruction to have RA = 0 as part of its
effective address calculation. This is illegal for all other XE form
instructions.

Add code to compute the proper effective address for stfiwx if
RA = 0 rather than treating it as illegal.

Signed-off-by: Kumar Gala <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/powerpc/math-emu/math.c |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/arch/powerpc/math-emu/math.c
+++ b/arch/powerpc/math-emu/math.c
@@ -407,11 +407,16 @@ do_mathemu(struct pt_regs *regs)
 
 	case XE:
 		idx = (insn >> 16) & 0x1f;
-		if (!idx)
-			goto illegal;
-
 		op0 = (void *)&current->thread.fpr[(insn >> 21) & 0x1f];
-		op1 = (void *)(regs->gpr[idx] + regs->gpr[(insn >> 11) & 0x1f]);
+		if (!idx) {
+			if (((insn >> 1) & 0x3ff) == STFIWX)
+				op1 = (void *)(regs->gpr[(insn >> 11) & 0x1f]);
+			else
+				goto illegal;
+		} else {
+			op1 = (void *)(regs->gpr[idx] + regs->gpr[(insn >> 11) & 0x1f]);
+		}
+
 		break;
 
 	case XEU:
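The address rule from the patch can be exercised outside the kernel. The following is a sketch under assumptions: `sketch_xe_address()` and `SKETCH_STFIWX` are hypothetical names, the register file is a plain array, and only the effective-address step of the XE case is modelled. The field positions are the ones used in the hunk above; 983 is the extended opcode of stfiwx.

```c
#include <stdint.h>

#define SKETCH_STFIWX 983	/* extended opcode of stfiwx (0x3d7) */

/*
 * Compute the effective address for an XE form instruction.
 * Returns 0 and stores the address in *ea, or -1 for the case the
 * emulator treats as illegal (RA = 0 on anything but stfiwx).
 */
static int sketch_xe_address(uint32_t insn, const unsigned long gpr[32],
			     unsigned long *ea)
{
	unsigned int ra = (insn >> 16) & 0x1f;
	unsigned int rb = (insn >> 11) & 0x1f;
	unsigned int xo = (insn >> 1) & 0x3ff;

	if (ra == 0) {
		if (xo != SKETCH_STFIWX)
			return -1;
		*ea = gpr[rb];		/* RA = 0 contributes nothing */
		return 0;
	}

	*ea = gpr[ra] + gpr[rb];
	return 0;
}
```

For stfiwx with RA = 0 the address is just the contents of RB; for every other XE form instruction the RA = 0 encoding is still rejected, which matches the `else goto illegal` branch.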
[patch 02/19] Fix sparc64 MAP_FIXED handling of framebuffer mmaps
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Chris Wright <[EMAIL PROTECTED]>

patch d58aa8c7b1cc0add7b03e26bdb8988d98d2f4cd1 in mainline.

From: Chris Wright <[EMAIL PROTECTED]>
Date: Tue, 23 Oct 2007 20:36:14 -0700
Subject: [patch 02/19] [PATCH] [SPARC64]: pass correct addr in get_fb_unmapped_area(MAP_FIXED)

Looks like the MAP_FIXED case is using the wrong address hint. I'd
expect the comment "don't mess with it" means pass the request
straight on through, not change the address requested to -ENOMEM.

Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/sparc64/kernel/sys_sparc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/sparc64/kernel/sys_sparc.c
+++ b/arch/sparc64/kernel/sys_sparc.c
@@ -319,7 +319,7 @@ unsigned long get_fb_unmapped_area(struc
 
 	if (flags & MAP_FIXED) {
 		/* Ok, don't mess with it. */
-		return get_unmapped_area(NULL, addr, len, pgoff, flags);
+		return get_unmapped_area(NULL, orig_addr, len, pgoff, flags);
 	}
 	flags &= ~MAP_SHARED;
 
[patch 03/19] MIPS: MT: Fix bug in multithreaded kernels.
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Ralf Baechle <[EMAIL PROTECTED]>

patch a76ab5c10d99bdf458067cb495e72c0ee5f09909 in mainline.

When GDB writes a breakpoint into the address area of the inferior
process, the kernel needs to invalidate the modified memory in the
inferior, which is done by calling flush_cache_page, which in turn
calls r4k_flush_cache_page and local_r4k_flush_cache_page for VSMP or
SMTC kernels via r4k_on_each_cpu().

As the VSMP and SMTC SMP kernels for 34K are running on a single shared
cache, it is possible to get away without interprocessor function calls.
This optimization is implemented in r4k_on_each_cpu, so
local_r4k_flush_cache_page is only ever called on the local CPU.

This is where the following code in local_r4k_flush_cache_page() strikes:

	/*
	 * If ownes no valid ASID yet, cannot possibly have gotten
	 * this page into the cache.
	 */
	if (cpu_context(smp_processor_id(), mm) == 0)
		return;

On VSMP and SMTC there is a separate cpu_context() value for each CPU
(TC). So if the CPU executing local_r4k_flush_cache_page has not
accessed the mm but one of the other CPUs has, there may be data to be
flushed in the cache, yet local_r4k_flush_cache_page will falsely
return, leaving the I-cache inconsistent for the breakpoint.

While the issue was discovered with GDB it also exists in
local_r4k_flush_cache_range() and local_r4k_flush_cache().

Fixed by introducing a new function has_valid_asid which on MT kernels
returns true if a mm is active on any processor in the system.

This is relatively expensive since for memory accesses in that loop
cache misses have to be assumed, but it seems the most viable solution
for 2.6.23 and older -stable kernels.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/mips/mm/c-r4k.c |   21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -360,11 +360,26 @@ static void r4k___flush_cache_all(void)
 	r4k_on_each_cpu(local_r4k___flush_cache_all, NULL, 1, 1);
 }
 
+static inline int has_valid_asid(const struct mm_struct *mm)
+{
+#if defined(CONFIG_MIPS_MT_SMP) || defined(CONFIG_MIPS_MT_SMTC)
+	int i;
+
+	for_each_online_cpu(i)
+		if (cpu_context(i, mm))
+			return 1;
+
+	return 0;
+#else
+	return cpu_context(smp_processor_id(), mm);
+#endif
+}
+
 static inline void local_r4k_flush_cache_range(void * args)
 {
 	struct vm_area_struct *vma = args;
 
-	if (!(cpu_context(smp_processor_id(), vma->vm_mm)))
+	if (!(has_valid_asid(vma->vm_mm)))
 		return;
 
 	r4k_blast_dcache();
@@ -383,7 +398,7 @@ static inline void local_r4k_flush_cache
 {
 	struct mm_struct *mm = args;
 
-	if (!cpu_context(smp_processor_id(), mm))
+	if (!has_valid_asid(mm))
 		return;
 
 	/*
@@ -434,7 +449,7 @@ static inline void local_r4k_flush_cache
 	 * If ownes no valid ASID yet, cannot possibly have gotten
 	 * this page into the cache.
 	 */
-	if (cpu_context(smp_processor_id(), mm) == 0)
+	if (!has_valid_asid(mm))
 		return;
 
 	addr &= PAGE_MASK;
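The difference between the old per-CPU test and has_valid_asid() can be reproduced with a two-entry array. This is a user-space sketch only: `struct mm_sketch`, `SKETCH_NCPUS` and the two functions are hypothetical stand-ins for the per-CPU context values, not the MIPS code itself.

```c
#define SKETCH_NCPUS 2

/* Hypothetical mm with one context (ASID) slot per CPU; 0 means "none yet". */
struct mm_sketch {
	unsigned long context[SKETCH_NCPUS];
};

/* Old test: only looks at the CPU that happens to run the flush. */
static int sketch_old_check(const struct mm_sketch *mm, int this_cpu)
{
	return mm->context[this_cpu] != 0;
}

/* New test on MT kernels: the mm counts as live if any CPU has a context. */
static int sketch_has_valid_asid(const struct mm_sketch *mm)
{
	int i;

	for (i = 0; i < SKETCH_NCPUS; i++)
		if (mm->context[i])
			return 1;

	return 0;
}
```

With a shared cache, an mm that only CPU 1 has touched can still have lines in the cache seen by CPU 0. The old check on CPU 0 reports "nothing to flush" in that situation, the new one does not.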
[patch 01/19] Fix sparc64 niagara optimized RAID xor asm
-stable review patch. If anyone has any objections, please let us know.

------------------
From: David Miller <[EMAIL PROTECTED]>

patch d060db63fd38a8a75f666576efc28cdc31cf in mainline.

[SPARC64]: Fix register usage in xor_raid_4().

Some typos led to using %i6/%i7 instead of %l6/%l7 in loads, which is
really really bad because those are the frame pointer and return PC.

Based upon a raid5 crash report by Bertrand Joel.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 arch/sparc64/lib/xor.S |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/arch/sparc64/lib/xor.S
+++ b/arch/sparc64/lib/xor.S
@@ -491,12 +491,12 @@ xor_niagara_4:	/* %o0=bytes, %o1=dest,
 	ldda	[%i1 + 0x10] %asi, %i2	/* %i2/%i3 = src1 + 0x10 */
 	xor	%g2, %i4, %g2
 	xor	%g3, %i5, %g3
-	ldda	[%i7 + 0x10] %asi, %i4	/* %i4/%i5 = src2 + 0x10 */
+	ldda	[%l7 + 0x10] %asi, %i4	/* %i4/%i5 = src2 + 0x10 */
 	xor	%l0, %g2, %l0
 	xor	%l1, %g3, %l1
 	stxa	%l0, [%i0 + 0x00] %asi
 	stxa	%l1, [%i0 + 0x08] %asi
-	ldda	[%i6 + 0x10] %asi, %g2	/* %g2/%g3 = src3 + 0x10 */
+	ldda	[%l6 + 0x10] %asi, %g2	/* %g2/%g3 = src3 + 0x10 */
 	ldda	[%i0 + 0x10] %asi, %l0	/* %l0/%l1 = dest + 0x10 */
 
 	xor	%i4, %i2, %i4
@@ -504,12 +504,12 @@ xor_niagara_4:	/* %o0=bytes, %o1=dest,
 	ldda	[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
 	xor	%g2, %i4, %g2
 	xor	%g3, %i5, %g3
-	ldda	[%i7 + 0x20] %asi, %i4	/* %i4/%i5 = src2 + 0x20 */
+	ldda	[%l7 + 0x20] %asi, %i4	/* %i4/%i5 = src2 + 0x20 */
 	xor	%l0, %g2, %l0
 	xor	%l1, %g3, %l1
 	stxa	%l0, [%i0 + 0x10] %asi
 	stxa	%l1, [%i0 + 0x18] %asi
-	ldda	[%i6 + 0x20] %asi, %g2	/* %g2/%g3 = src3 + 0x20 */
+	ldda	[%l6 + 0x20] %asi, %g2	/* %g2/%g3 = src3 + 0x20 */
 	ldda	[%i0 + 0x20] %asi, %l0	/* %l0/%l1 = dest + 0x20 */
 
 	xor	%i4, %i2, %i4
@@ -517,12 +517,12 @@ xor_niagara_4:	/* %o0=bytes, %o1=dest,
 	ldda	[%i1 + 0x30] %asi, %i2	/* %i2/%i3 = src1 + 0x30 */
 	xor	%g2, %i4, %g2
 	xor	%g3, %i5, %g3
-	ldda	[%i7 + 0x30] %asi, %i4	/* %i4/%i5 = src2 + 0x30 */
+	ldda	[%l7 + 0x30] %asi, %i4	/* %i4/%i5 = src2 + 0x30 */
 	xor	%l0, %g2, %l0
 	xor	%l1, %g3, %l1
 	stxa	%l0, [%i0 + 0x20] %asi
 	stxa	%l1, [%i0 + 0x28] %asi
-	ldda	[%i6 + 0x30] %asi, %g2	/* %g2/%g3 = src3 + 0x30 */
+	ldda	[%l6 + 0x30] %asi, %g2	/* %g2/%g3 = src3 + 0x30 */
 	ldda	[%i0 + 0x30] %asi, %l0	/* %l0/%l1 = dest + 0x30 */
 
 	prefetch	[%i1 + 0x40], #one_read
[patch 00/19] 2.6.23-stable review, arch specific stuff
This is the start of the stable review cycle for the 2.6.23.X release.
There are 19 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let us know. If anyone is a maintainer of the proper subsystem, and
wants to add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the
Cc: line. If you wish to be a reviewer, please email [EMAIL PROTECTED]
to add your name to the list. If you want to be off the reviewer list,
also email us.

Responses should be made by Friday 00:00:00 UTC. Anything received
after that time might be too late.

This set of patches focuses on arch specific changes.

The diffstat of this review series is included below.

thanks,

greg k-h

-
 arch/i386/boot/boot.h                  |    8 ++--
 arch/i386/boot/header.S                |   62 +++--
 arch/i386/boot/video-bios.c            |    2 -
 arch/i386/boot/video-vesa.c            |    2 -
 arch/i386/boot/video.c                 |    2 -
 arch/i386/kernel/tsc.c                 |    5 +-
 arch/i386/xen/enlighten.c              |   57 --
 arch/i386/xen/mmu.c                    |   29 +--
 arch/i386/xen/multicalls.c             |   29 +--
 arch/i386/xen/multicalls.h             |    3 +
 arch/i386/xen/xen-ops.h                |    1
 arch/mips/mm/c-r4k.c                   |   21 +--
 arch/powerpc/math-emu/math.c           |   13 --
 arch/powerpc/platforms/cell/axon_msi.c |    4 +-
 arch/sparc64/kernel/sys_sparc.c        |    2 -
 arch/sparc64/lib/xor.S                 |   12 +++---
 arch/um/Makefile                       |    3 +
 arch/um/include/common-offsets.h       |    1
 arch/um/include/sysdep-i386/stub.h     |    3 -
 arch/um/kernel/skas/clone.c            |    1
 arch/um/os-Linux/main.c                |    1
 arch/um/os-Linux/skas/mem.c            |    1
 arch/um/os-Linux/skas/process.c        |    2 -
 arch/um/os-Linux/start_up.c            |    1
 arch/um/os-Linux/tt.c                  |    1
 arch/um/os-Linux/util.c                |   38
 arch/um/sys-i386/user-offsets.c        |    6 +--
 arch/um/sys-x86_64/user-offsets.c      |    9
 arch/x86_64/mm/init.c                  |    6 ---
 arch/x86_64/mm/pageattr.c              |    9 +++-
 fs/xfs/linux-2.6/xfs_buf.c             |   13 ++
 include/asm-mips/hazards.h             |   54
 include/linux/bootmem.h                |    1
 include/xen/interface/vcpu.h           |    5 +-
 mm/sparse.c                            |   11 -
 35 files changed, 307 insertions(+), 111 deletions(-)
Re: Moderated list
On 15-11-07 00:23, David Miller wrote:

> From: Takashi Iwai <[EMAIL PROTECTED]>
>
>> BTW, I also prefer keeping the name [EMAIL PROTECTED] It's been so.
>
> That's fine with me, I've changed it [EMAIL PROTECTED]

Great, thanks. Jaroslav -- given that this list won't need moderation
I'd consider it the main/only alsa-devel. The alsa-devel subscriber
database was cleansed only a couple of months ago when moving from
sourceforge so it should now be okay to just transfer all subscriptions.

Or maybe you're already moving things; mailman.alsa-project.org seems
to be down at least.

Rene.
[patch 10/13] sched: keep utime/stime monotonic
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Frans Pop <[EMAIL PROTECTED]>

sched: keep utime/stime monotonic

cpustats use utime/stime as a ratio against sum_exec_runtime; as a
consequence it can happen - when the ratio changes faster than time
accumulates - that either can appear to go backwards.

Combined backport for 2.6.23 of the following patches from mainline:

commit 73a2bcb0edb9ffb0b007b3546b430e2c6e415eee
Author: Peter Zijlstra <[EMAIL PROTECTED]>
    sched: keep utime/stime monotonic

commit 9301899be75b464ef097f0b5af7af6d9bd8f68a7
Author: Balbir Singh <[EMAIL PROTECTED]>
    sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2

Signed-off-by: Frans Pop <[EMAIL PROTECTED]>
CC: Peter Zijlstra <[EMAIL PROTECTED]>
CC: Balbir Singh <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 fs/proc/array.c       |    6 --
 include/linux/sched.h |    1 +
 kernel/fork.c         |    2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -351,7 +351,8 @@ static cputime_t task_utime(struct task_
 	}
 	utime = (clock_t)temp;
 
-	return clock_t_to_cputime(utime);
+	p->prev_utime = max(p->prev_utime, clock_t_to_cputime(utime));
+	return p->prev_utime;
 }
 
 static cputime_t task_stime(struct task_struct *p)
@@ -366,7 +367,8 @@ static cputime_t task_stime(struct task_
 	stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
 			cputime_to_clock_t(task_utime(p));
 
-	return clock_t_to_cputime(stime);
+	p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
+	return p->prev_stime;
 }
 #endif
 
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1022,6 +1022,7 @@ struct task_struct {
 
 	unsigned int rt_priority;
 	cputime_t utime, stime;
+	cputime_t prev_utime, prev_stime;
 	unsigned long nvcsw, nivcsw; /* context switch counts */
 	struct timespec start_time;		/* monotonic time */
 	struct timespec real_start_time;	/* boot based time */
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1045,6 +1045,8 @@ static struct task_struct *copy_process(
 
 	p->utime = cputime_zero;
 	p->stime = cputime_zero;
+	p->prev_utime = cputime_zero;
+	p->prev_stime = cputime_zero;
 
 #ifdef CONFIG_TASK_XACCT
 	p->rchar = 0;		/* I/O counter: bytes read */
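How a ratio-scaled value can go backwards, and how remembering the previous result prevents it, fits in a few lines of user-space C. This is a sketch under assumptions: `struct task_sketch` and both functions are hypothetical, all values are in arbitrary units, and the scaling is a simplified form of splitting the measured runtime by the tick ratio.

```c
#include <stdint.h>

/* Hypothetical per-task accounting state. */
struct task_sketch {
	uint64_t utime_ticks;		/* sampled user ticks */
	uint64_t stime_ticks;		/* sampled system ticks */
	uint64_t sum_exec_runtime;	/* precisely measured total runtime */
	uint64_t prev_utime;		/* largest value reported so far */
};

/* Scale the precise runtime by the sampled user/system ratio. */
static uint64_t sketch_scaled_utime(const struct task_sketch *p)
{
	uint64_t total = p->utime_ticks + p->stime_ticks;

	if (!total)
		return 0;
	return p->sum_exec_runtime * p->utime_ticks / total;
}

/* Report the scaled value, but never less than what was reported before. */
static uint64_t sketch_monotonic_utime(struct task_sketch *p)
{
	uint64_t u = sketch_scaled_utime(p);

	if (u > p->prev_utime)
		p->prev_utime = u;
	return p->prev_utime;
}
```

With 9 user ticks, 1 system tick and a runtime of 100, the scaled user time is 90. If two more system ticks arrive while the runtime only grows to 101, the raw scaled value drops to 75; the clamped version keeps reporting 90.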
[patch 12/13] fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Hugh Dickins <[EMAIL PROTECTED]>

patch 487e9bf25cbae11b131d6a14bdbb3a6a77380837 in mainline.

It's possible to provoke unionfs (not yet in mainline, though in mm and
some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)). I
expect it's possible to provoke the 2.6.23 ecryptfs in the same way (but
the 2.6.24 ecryptfs no longer calls lower level's ->writepage).

This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE
could leak from tmpfs via write_cache_pages and unionfs to userspace.
There's already a fix (e423003028183df54f039dfda8b58c49e78c89d7 -
writeback: don't propagate AOP_WRITEPAGE_ACTIVATE) in the tree for
that, and it's okay so far as it goes; but insufficient because it
doesn't address the underlying issue, that shmem_writepage expects to
be called only by vmscan (relying on backing_dev_info capabilities to
prevent the normal writeback path from ever approaching it).

That's an increasingly fragile assumption, and ramdisk_writepage (the
other source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
wbc->for_reclaim before returning it. Make the same check in
shmem_writepage, thereby sidestepping the page_mapped BUG also.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
Cc: Erez Zadok <[EMAIL PROTECTED]>
Reviewed-by: Pekka Enberg <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 mm/shmem.c |   15 +++
 1 file changed, 15 insertions(+)

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -916,6 +916,21 @@ static int shmem_writepage(struct page *
 	struct inode *inode;
 
 	BUG_ON(!PageLocked(page));
+	/*
+	 * shmem_backing_dev_info's capabilities prevent regular writeback or
+	 * sync from ever calling shmem_writepage; but a stacking filesystem
+	 * may use the ->writepage of its underlying filesystem, in which case
+	 * we want to do nothing when that underlying filesystem is tmpfs
+	 * (writing out to swap is useful as a response to memory pressure, but
+	 * of no use to stabilize the data) - just redirty the page, unlock it
+	 * and claim success in this case. AOP_WRITEPAGE_ACTIVATE, and the
+	 * page_mapped check below, must be avoided unless we're in reclaim.
+	 */
+	if (!wbc->for_reclaim) {
+		set_page_dirty(page);
+		unlock_page(page);
+		return 0;
+	}
 	BUG_ON(page_mapped(page));
 
 	mapping = page->mapping;
[patch 13/13] BLOCK: Fix bad sharing of tag busy list on queues with shared tag maps
-stable review patch. If anyone has any objections, please let us know.

------------------
From: Jens Axboe <[EMAIL PROTECTED]>

patch 6eca9004dfcb274a502438a591df5b197690afb1 in mainline.

For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list! The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.

So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing,
the low level driver should call it for each queue separately since
otherwise you cannot kill tags on just a single queue, e.g. for a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.

This is fixed with commit 6eca9004dfcb274a502438a591df5b197690afb1 in
Linus' tree.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 block/ll_rw_blk.c      |    8 +++-
 include/linux/blkdev.h |    2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -819,7 +819,6 @@ static int __blk_free_tags(struct blk_qu
 	retval = atomic_dec_and_test(&bqt->refcnt);
 	if (retval) {
 		BUG_ON(bqt->busy);
-		BUG_ON(!list_empty(&bqt->busy_list));
 
 		kfree(bqt->tag_index);
 		bqt->tag_index = NULL;
@@ -931,7 +930,6 @@ static struct blk_queue_tag *__blk_queue
 	if (init_tag_map(q, tags, depth))
 		goto fail;
 
-	INIT_LIST_HEAD(&tags->busy_list);
 	tags->busy = 0;
 	atomic_set(&tags->refcnt, 1);
 	return tags;
@@ -982,6 +980,7 @@ int blk_queue_init_tags(struct request_q
 	 */
 	q->queue_tags = tags;
 	q->queue_flags |= (1 << QUEUE_FLAG_QUEUED);
+	INIT_LIST_HEAD(&q->tag_busy_list);
 	return 0;
 fail:
 	kfree(tags);
@@ -1152,7 +1151,7 @@ int blk_queue_start_tag(struct request_q
 	rq->tag = tag;
 	bqt->tag_index[tag] = rq;
 	blkdev_dequeue_request(rq);
-	list_add(&rq->queuelist, &bqt->busy_list);
+	list_add(&rq->queuelist, &q->tag_busy_list);
 	bqt->busy++;
 	return 0;
 }
@@ -1173,11 +1172,10 @@ EXPORT_SYMBOL(blk_queue_start_tag);
  **/
 void blk_queue_invalidate_tags(struct request_queue *q)
 {
-	struct blk_queue_tag *bqt = q->queue_tags;
 	struct list_head *tmp, *n;
 	struct request *rq;
 
-	list_for_each_safe(tmp, n, &bqt->busy_list) {
+	list_for_each_safe(tmp, n, &q->tag_busy_list) {
 		rq = list_entry_rq(tmp);
 
 		if (rq->tag == -1) {
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -356,7 +356,6 @@ enum blk_queue_state {
 struct blk_queue_tag {
 	struct request **tag_index;	/* map of busy tags */
 	unsigned long *tag_map;		/* bit map of free/busy tags */
-	struct list_head busy_list;	/* fifo list of busy tags */
 	int busy;			/* current depth */
 	int max_depth;			/* what we will send to device */
 	int real_max_depth;		/* what the array can hold */
@@ -451,6 +450,7 @@ struct request_queue
 	unsigned int		dma_alignment;
 
 	struct blk_queue_tag	*queue_tags;
+	struct list_head	tag_busy_list;
 
 	unsigned int		nr_sorted;
 	unsigned int		in_flight;
[patch 09/13] fix the softlockup watchdog to actually work
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Ingo Molnar <[EMAIL PROTECTED]>

patch a115d5caca1a2905ba7a32b408a6042b20179aaa in mainline.

this Xen related commit:

   commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
   Author: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
   Date:   Tue May 8 00:28:02 2007 -0700

       Ignore stolen time in the softlockup watchdog

broke the softlockup watchdog to never report any lockups. (!)

print_timestamp defaults to 0, this makes the following condition
always true:

	if (print_timestamp < (touch_timestamp + 1) ||

and we'll in essence never report soft lockups.

apparently the functionality of the soft lockup watchdog was never
actually tested with that patch applied ...

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 kernel/softlockup.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -80,10 +80,11 @@ void softlockup_tick(void)
 	print_timestamp = per_cpu(print_timestamp, this_cpu);
 
 	/* report at most once a second */
-	if (print_timestamp < (touch_timestamp + 1) ||
-		did_panic ||
-			!per_cpu(watchdog_task, this_cpu))
+	if ((print_timestamp >= touch_timestamp &&
+			print_timestamp < (touch_timestamp + 1)) ||
+			did_panic || !per_cpu(watchdog_task, this_cpu)) {
 		return;
+	}
 
 	/* do not print during early bootup: */
 	if (unlikely(system_state != SYSTEM_RUNNING)) {
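A minimal stand-alone sketch of the two conditions may make the failure
easier to see. It is not the kernel code: suppressed_old() and
suppressed_new() are hypothetical helper names, timestamps are plain
integers in seconds, and the did_panic and watchdog_task terms are left
out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the "report at most once a second" check.
 * Both helpers return true when the report would be suppressed.
 */

/* Pre-patch condition: with print_ts == 0 this holds for any touch_ts. */
bool suppressed_old(unsigned long print_ts, unsigned long touch_ts)
{
	return print_ts < touch_ts + 1;
}

/* Post-patch condition: suppress only if print_ts equals this touch_ts. */
bool suppressed_new(unsigned long print_ts, unsigned long touch_ts)
{
	return print_ts >= touch_ts && print_ts < touch_ts + 1;
}
```

With print_ts left at its default of 0, the old check suppresses every
report, while the new one only suppresses a report that has already been
printed for the current touch timestamp.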
[patch 11/13] Fix compat futex hangs.
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: David Miller <[EMAIL PROTECTED]>

[FUTEX]: Fix address computation in compat code.

[ Upstream commit: 3c5fd9c77d609b51c0bab682c9d40cbb496ec6f1 ]

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bit clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xf7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 kernel/futex_compat.c |   27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -29,6 +29,15 @@ fetch_robust_entry(compat_uptr_t *uentry
 	return 0;
 }
 
+static void __user *futex_uaddr(struct robust_list *entry,
+				compat_long_t futex_offset)
+{
+	compat_uptr_t base = ptr_to_compat(entry);
+	void __user *uaddr = compat_ptr(base + futex_offset);
+
+	return uaddr;
+}
+
 /*
  * Walk curr->robust_list (very carefully, it's a userspace list!)
  * and mark any locks found there dead, and notify any waiters.
@@ -75,11 +84,13 @@ void compat_exit_robust_list(struct task
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr = futex_uaddr(entry,
+							 futex_offset);
 
+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 		if (rc)
 			return;
 		uentry = next_uentry;
@@ -93,9 +104,11 @@ void compat_exit_robust_list(struct task
 
 		cond_resched();
 	}
-	if (pending)
-		handle_futex_death((void __user *)pending + futex_offset,
-				   curr, pip);
+	if (pending) {
+		void __user *uaddr = futex_uaddr(pending, futex_offset);
+
+		handle_futex_death(uaddr, curr, pip);
+	}
 }
 
 asmlinkage long
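The sign-extension effect can be reproduced in user space with plain
integer arithmetic. The sketch below is only an illustration under
stated assumptions: uaddr_sign_extended() and uaddr_compat() are
hypothetical names, pointers are modelled as uint64_t, and the 32-bit add
followed by zero extension stands in for what ptr_to_compat() and
compat_ptr() are used for in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Old computation, modelled on integers: the signed 32-bit offset is
 * promoted to 64 bits (sign-extended) before being added to the pointer.
 */
uint64_t uaddr_sign_extended(uint64_t entry, int32_t futex_offset)
{
	return entry + (uint64_t)(int64_t)futex_offset;
}

/*
 * Fixed computation, modelled on integers: add in 32 bits, then
 * zero-extend, so the upper 32 bits of the result are always clear.
 */
uint64_t uaddr_compat(uint64_t entry, int32_t futex_offset)
{
	uint32_t base = (uint32_t)entry;

	return (uint64_t)(uint32_t)(base + (uint32_t)futex_offset);
}
```

For a made-up entry of 0x70000000 and an offset whose 32-bit pattern is
0x87f16bd8, the first helper yields an address with all upper 32 bits
set, while the second yields 0xf7f16bd8. For offsets without the sign
bit set the two agree.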
[patch 07/13] writeback: dont propagate AOP_WRITEPAGE_ACTIVATE
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Andrew Morton <[EMAIL PROTECTED]>

patch e423003028183df54f039dfda8b58c49e78c89d7 in mainline.

This is a writeback-internal marker but we're propagating it all the way
back to userspace!

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 mm/page-writeback.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -672,8 +672,10 @@ retry:
 
 			ret = (*writepage)(page, wbc, data);
 
-			if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE))
+			if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE)) {
 				unlock_page(page);
+				ret = 0;
+			}
 			if (ret || (--(wbc->nr_to_write) <= 0))
 				done = 1;
 			if (wbc->nonblocking && bdi_write_congested(bdi)) {
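The idea of the fix, reduced to its core, is that an internal status
marker gets translated before the value leaves the writeback loop. The
sketch below is a user-space illustration only: the function name and the
marker constant are assumptions made for the example, not kernel
definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the writeback-internal marker value. */
#define WRITEPAGE_ACTIVATE_MARKER 0x80000

/*
 * Map a ->writepage() style result to what callers further up should
 * see: the internal marker becomes 0, real errors pass through.
 */
int sanitize_writepage_ret(int ret)
{
	if (ret == WRITEPAGE_ACTIVATE_MARKER)
		return 0;	/* internal hint, not an error */
	return ret;
}
```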
[patch 08/13] splice: fix double kunmap() in vmsplice copy path
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Jens Axboe <[EMAIL PROTECTED]>

patch 6866bef40d06f7c2baac3a855b1917a8ca75456c in mainline.

The out label should not include the unmap, the only way to jump
there already has unmapped the source.

 2000 f7c21a00 c0489036 00018e32 0002 1000
Call Trace:
 [] pipe_to_user+0xca/0xd3
 [] __splice_from_pipe+0x53/0x1bd
 [] [ cut here ]
filemap_fault+0x221/0x380
 [] pipe_to_user+0x0/0xd3
 [] sys_vmsplice+0x3b7/0x422
 [] kernel BUG at mm/highmem.c:206!
handle_mm_fault+0x4d5/0x8eb
 [] kmap_atomic+0x1c/0x20
 [] unmap_vmas+0x3d1/0x584
 [] free_pgtables+0x90/0xa0
 [] pgd_dtor+0x0/0x1
 [] audit_syscall_exit+0x2aa/0x2c6
 [] do_syscall_trace+0x124/0x169
 [] syscall_call+0x7/0xb
 ===
Code: 2d 00 d0 5b 00 25 00 00 e0 ff 29 invalid opcode: [#1]
c2 89 d0 c1 e8 0c 8b 14 85 a0 6c 7c c0 4a 85 d2 89 14 85 a0 6c 7c c0 74 07
31 c9 4a 75 15 eb 04 <0f> 0b eb fe 31 c9 81 3d 78 38 6d c0 78 38 6d c0 0f
95 c1 b0 01
EIP: [] kunmap_high+0x51/0x8e SS:ESP 0068:f5960df0
SMP
Modules linked in: netconsole autofs4 hidp nfs lockd nfs_acl rfcomm l2cap
bluetooth sunrpc ipv6 ib_iser rdma_cm ib_cm iw_cmib_sa ib_mad ib_core
ib_addr iscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath
dm_mod video output sbs batteryac parport_pc lp parport sg i2c_piix4
i2c_core floppy cfi_probe gen_probe scb2_flash mtd chipreg tg3 e1000
button ide_cd serio_raw cdrom aic7xxx scsi_transport_spi sd_mod scsi_mod
ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:3
EIP:0060:[]Not tainted VLI
EFLAGS: 00010246 (2.6.23 #1)
EIP is at kunmap_high+0x51/0x8e

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 fs/splice.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1390,10 +1390,10 @@ static int pipe_to_user(struct pipe_inod
 	if (copy_to_user(sd->u.userptr, src + buf->offset, sd->len))
 		ret = -EFAULT;
 
+	buf->ops->unmap(pipe, buf, src);
 out:
 	if (ret > 0)
 		sd->u.userptr += ret;
-	buf->ops->unmap(pipe, buf, src);
 	return ret;
 }
 
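The control flow behind the fix can be modelled with a map counter. This
is a simplified sketch, not pipe_to_user() itself: buf_model, buf_map(),
buf_unmap() and copy_to_user_model() are hypothetical names, and the
fast_ok flag stands in for the atomic fast-path copy succeeding.

```c
#include <assert.h>

struct buf_model {
	int map_count;	/* would go negative on a double unmap */
};

void buf_map(struct buf_model *b)   { b->map_count++; }
void buf_unmap(struct buf_model *b) { b->map_count--; }

/* Returns the number of bytes "copied"; fast_ok selects the path taken. */
int copy_to_user_model(struct buf_model *b, int fast_ok, int len)
{
	int ret;

	buf_map(b);
	ret = fast_ok ? len : 0;
	buf_unmap(b);		/* fast path unmaps before it can jump */
	if (ret)
		goto out;

	buf_map(b);		/* slow path maps again */
	ret = len;
	buf_unmap(b);		/* unmap stays above the label */
out:
	return ret;
}
```

With the unmap placed above the label, both paths leave the counter at
zero. Moving it below the label would decrement it twice on the fast
path, which is the double kunmap() the patch title refers to.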
[patch 05/13] HOWTO: update ja_JP/HOWTO with latest changes
-stable review patch. If anyone has any objections, please let us know. -- From: Tsugikazu Shibata <[EMAIL PROTECTED]> patch 3b6662f192fc521b9657f63e68d20ec99979dae6 upstream. Here is another sync patch of Documentation/ja_JP/HOWTO Japanese developer sent me some cosmetic changes and also follow changes of HOWTO Cross reference URL (sosdg.org/qiyong/lxr) known_regression explanations on kernel dev. process Signed-off-by: Tsugikazu Shibata <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- Documentation/ja_JP/HOWTO | 84 -- 1 file changed, 45 insertions(+), 39 deletions(-) --- a/Documentation/ja_JP/HOWTO +++ b/Documentation/ja_JP/HOWTO @@ -1,4 +1,4 @@ -???NOTE: +NOTE: This is a version of Documentation/HOWTO translated into Japanese. This document is maintained by Tsugikazu Shibata <[EMAIL PROTECTED]> and the JF Project team . @@ -11,14 +11,14 @@ for non English (read: Japanese) speaker fork. So if you have any comments or updates for this file, please try to update the original English file first. -Last Updated: 2007/07/18 +Last Updated: 2007/09/23 == -linux-2.6.22/Documentation/HOWTO +linux-2.6.23/Documentation/HOWTO ?? ??? JF ?? < http://www.linux.or.jp/JF/ > - 2007/07/16 + 2007/09/19 Tsugikazu Shibata ?? (Masanori Kobayasi) @@ -27,6 +27,7 @@ linux-2.6.22/Documentation/HOWTO (Kenji Noguchi) (Takayoshi Kochi) (iwamoto) + (Satoshi Uchida) == Linux ?? @@ -40,7 +41,7 @@ Linux ?? ??? ? - +? - @@ -59,7 +60,7 @@ Linux ?? ? ??()???(??: ??)??? ??C -?? +?? - "The C Programming Language" by Kernighan and Ritchie [Prentice Hall] -2??(B.W. ???/D.M. ??? ???) [] - "Practical C Programming" by Steve Oualline [O'Reilly] @@ -76,7 +77,7 @@ Linux ?? ?? C ? ? ?gcc ??? info ?( info gcc )??? -? +? ? ??? @@ -92,7 +93,7 @@ Linux ?? Linux GPL ??? ? -?COPYING +?COPYING ??Linux Kernel ?? ? @@ -109,7 +110,8 @@ Linux ?? ?? ? ? -?? [EMAIL PROTECTED] ? +?? [EMAIL PROTECTED] ???
[patch 06/13] SLUB: Fix memory leak by not reusing cpu_slab
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Christoph Lameter <[EMAIL PROTECTED]>

patch 05aa345034de6ae9c77fb93f6a796013641d57d5 in mainline.

SLUB: Fix memory leak by not reusing cpu_slab

Fix the memory leak that may occur when we attempt to reuse a cpu_slab
that was allocated while we reenabled interrupts in order to be able to
grow a slab cache. The per cpu freelist may contain objects and in that
situation we may overwrite the per cpu freelist pointer, losing objects.
This only occurs if we find that the concurrently allocated slab fits
our allocation needs.

If we simply always deactivate the slab then the freelist will be
properly reintegrated and the memory leak will go away.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
Cc: Hugh Dickins <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 mm/slub.c |   22 +---------------------
 1 file changed, 1 insertion(+), 21 deletions(-)

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1501,28 +1501,8 @@ new_slab:
 	page = new_slab(s, gfpflags, node);
 	if (page) {
 		cpu = smp_processor_id();
-		if (s->cpu_slab[cpu]) {
-			/*
-			 * Someone else populated the cpu_slab while we
-			 * enabled interrupts, or we have gotten scheduled
-			 * on another cpu. The page may not be on the
-			 * requested node even if __GFP_THISNODE was
-			 * specified. So we need to recheck.
-			 */
-			if (node == -1 ||
-				page_to_nid(s->cpu_slab[cpu]) == node) {
-				/*
-				 * Current cpuslab is acceptable and we
-				 * want the current one since its cache hot
-				 */
-				discard_slab(s, page);
-				page = s->cpu_slab[cpu];
-				slab_lock(page);
-				goto load_freelist;
-			}
-			/* New slab does not fit our expectations */
+		if (s->cpu_slab[cpu])
 			flush_slab(s, s->cpu_slab[cpu], cpu);
-		}
 		slab_lock(page);
 		SetSlabFrozen(page);
 		s->cpu_slab[cpu] = page;
[patch 02/13] locks: fix possible infinite loop in posix deadlock detection
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: J. Bruce Fields <[EMAIL PROTECTED]>

patch 97855b49b6bac0bd25f16b017883634d13591d00 in mainline.

It's currently possible to send posix_locks_deadlock() into an infinite
loop (under the BKL).

For now, fix this just by bailing out after a few iterations. We may
want to fix this in a way that better clarifies the semantics of
deadlock detection. But that will take more time, and this minimal fix
is probably adequate for any realistic scenario, and is simple enough to
be appropriate for applying to stable kernels now.

Thanks to George Davis for reporting the problem.

Cc: "George G. Davis" <[EMAIL PROTECTED]>
Signed-off-by: J. Bruce Fields <[EMAIL PROTECTED]>
Acked-by: Alan Cox <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 fs/locks.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/fs/locks.c
+++ b/fs/locks.c
@@ -694,11 +694,20 @@ EXPORT_SYMBOL(posix_test_lock);
  * Note: the above assumption may not be true when handling lock requests
  * from a broken NFS client. But broken NFS clients have a lot more to
  * worry about than proper deadlock detection anyway... --okir
+ *
+ * However, the failure of this assumption (also possible in the case of
+ * multiple tasks sharing the same open file table) also means there's no
+ * guarantee that the loop below will terminate. As a hack, we give up
+ * after a few iterations.
  */
+
+#define MAX_DEADLK_ITERATIONS 10
+
 static int posix_locks_deadlock(struct file_lock *caller_fl,
 				struct file_lock *block_fl)
 {
 	struct list_head *tmp;
+	int i = 0;
 
 next_task:
 	if (posix_same_owner(caller_fl, block_fl))
@@ -706,6 +715,8 @@ next_task:
 	list_for_each(tmp, &blocked_list) {
 		struct file_lock *fl = list_entry(tmp, struct file_lock, fl_link);
 		if (posix_same_owner(fl, block_fl)) {
+			if (i++ > MAX_DEADLK_ITERATIONS)
+				return 0;
 			fl = fl->fl_next;
 			block_fl = fl;
 			goto next_task;
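The bail-out can be illustrated on a much simpler wait-for chain. The
sketch below is not posix_locks_deadlock(): struct owner and
would_deadlock() are hypothetical, and each owner is assumed to wait on at
most one other owner. Only the iteration cap mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEADLK_ITERATIONS 10

/* Hypothetical model: an owner is blocked on at most one other owner. */
struct owner {
	struct owner *waits_for;	/* NULL when not blocked */
};

/*
 * Returns 1 if following the wait chain from 'blocker' reaches 'caller',
 * 0 otherwise, including when the walk is abandoned after too many hops.
 */
int would_deadlock(const struct owner *caller, const struct owner *blocker)
{
	int i = 0;

	while (blocker != NULL) {
		if (blocker == caller)
			return 1;
		if (i++ > MAX_DEADLK_ITERATIONS)
			return 0;	/* give up instead of looping forever */
		blocker = blocker->waits_for;
	}
	return 0;
}
```

A cycle that does not include the caller (a waits for b, b waits for a)
would spin forever without the cap. With it, the walk ends after a
bounded number of hops and reports no deadlock, which is the trade-off
the commit message describes.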
[patch 01/13] lockdep: fix mismatched lockdep_depth/curr_chain_hash
-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Gregory Haskins <[EMAIL PROTECTED]>

patch 3aa416b07f0adf01c090baab26fb70c35ec17623 in mainline.

It is possible for the current->curr_chain_key to become inconsistent
with the current index if the chain fails to validate. The end result is
that future lock_acquire() operations may inadvertently fail to find a
hit in the cache resulting in a new node being added to the graph for
every acquire.

Signed-off-by: Gregory Haskins <[EMAIL PROTECTED]>
Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 kernel/lockdep.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1521,7 +1521,7 @@ cache_hit:
 }
 
 static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
-			       struct held_lock *hlock, int chain_head)
+		struct held_lock *hlock, int chain_head, u64 chain_key)
 {
 	/*
 	 * Trylock needs to maintain the stack of held locks, but it
@@ -1534,7 +1534,7 @@ static int validate_chain(struct task_st
 	 * graph_lock for us)
 	 */
 	if (!hlock->trylock && (hlock->check == 2) &&
-			lookup_chain_cache(curr->curr_chain_key, hlock->class)) {
+			lookup_chain_cache(chain_key, hlock->class)) {
 		/*
 		 * Check whether last held lock:
 		 *
@@ -1576,7 +1576,7 @@ static int validate_chain(struct task_st
 #else
 static inline int validate_chain(struct task_struct *curr,
 		struct lockdep_map *lock, struct held_lock *hlock,
-		int chain_head)
+		int chain_head, u64 chain_key)
 {
 	return 1;
 }
@@ -2450,11 +2450,11 @@ static int __lock_acquire(struct lockdep
 		chain_head = 1;
 	}
 	chain_key = iterate_chain_key(chain_key, id);
-	curr->curr_chain_key = chain_key;
 
-	if (!validate_chain(curr, lock, hlock, chain_head))
+	if (!validate_chain(curr, lock, hlock, chain_head, chain_key))
 		return 0;
 
+	curr->curr_chain_key = chain_key;
 	curr->lockdep_depth++;
 	check_chain_key(curr);
 #ifdef CONFIG_DEBUG_LOCKDEP
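The ordering change amounts to "validate first, commit afterwards", so
that a failed validation leaves the key and the depth in step. A toy
model of that ordering, with hypothetical names (task_model,
lock_acquire_model) and the validation outcome passed in by the caller,
might look like this:

```c
#include <assert.h>
#include <stdint.h>

struct task_model {
	uint64_t curr_chain_key;
	int lockdep_depth;
};

/*
 * chain_is_valid stands in for the result of validate_chain().
 * Per-task state is only updated once validation has succeeded.
 */
int lock_acquire_model(struct task_model *t, uint64_t new_key,
		       int chain_is_valid)
{
	if (!chain_is_valid)
		return 0;	/* nothing committed: key and depth still match */

	t->curr_chain_key = new_key;
	t->lockdep_depth++;
	return 1;
}
```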
[patch 00/13] 2.6.23-stable review, core kernel changes
This is the start of the stable review cycle for the 2.6.23.X release.
There are 13 patches in this series, all will be posted as a response to
this one. If anyone has any issues with these being applied, please let
us know. If anyone is a maintainer of the proper subsystem, and wants to
add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the Cc:
line. If you wish to be a reviewer, please email [EMAIL PROTECTED] to add
your name to the list. If you want to be off the reviewer list, also
email us.

Responses should be made by Friday 00:00:00 UTC. Anything received after
that time might be too late.

This set of patches focuses on only the core kernel. Other sets of
patches will follow if you are interested in those instead.

The diffstat of this review series is included below.

thanks,

greg k-h

---
 Documentation/ja_JP/HOWTO |   84 --
 block/ll_rw_blk.c         |    8 +---
 fs/locks.c                |   11 ++
 fs/proc/array.c           |    6 ++-
 fs/splice.c               |    2 -
 include/linux/blkdev.h    |    2 -
 include/linux/sched.h     |    1
 kernel/fork.c             |    2 +
 kernel/futex_compat.c     |   27 ++
 kernel/lockdep.c          |   10 ++---
 kernel/params.c           |    8 +++-
 kernel/softlockup.c       |    7 ++-
 mm/filemap.c              |   13 +--
 mm/page-writeback.c       |    4 +-
 mm/shmem.c                |   15
 mm/slub.c                 |   22
 16 files changed, 125 insertions(+), 97 deletions(-)
Re: [alsa-devel] [BUG] New Kernel Bugs
On 15-11-07 05:16, Bron Gondwana wrote:

> Totally unrelated - I sent something to the kolab mailing list a couple

[ ... ]

> I'm sure if I had something that I considered worth informing the ALSA
> project of, I'd be wary of spending the same effort writing a good post
> knowing it may be dropped in between the by a list moderator just
> selecting all and bouncing them.

Totally unrelated indeed so why are you spouting crap? If the kolab list
has a problem take it up with them but keep ALSA out of it. alsa-devel
has only ever moderated out spam -- nothing else.

Rene
Re: User Mode Linux still broken in 2.6.23.1
On Wed, Nov 14, 2007 at 11:58:15PM -0600, Rob Landley wrote:
> On Wednesday 14 November 2007 12:54:44 Greg KH wrote:
> > On Sat, Nov 03, 2007 at 11:51:50PM -0500, Rob Landley wrote:
> > > Building with the attached .config on x86-64, it does this:
> > >
> > > CC arch/um/kernel/smp.o
> > > In file included from include/asm/arch/tlb.h:11,
> > > from include/asm/tlb.h:4,
> > > from arch/um/kernel/smp.c:8:
> > > include/asm-generic/tlb.h: In function 'tlb_flush_mmu':
> > > include/asm-generic/tlb.h:76: error: implicit declaration of function
> > > 'release_pages' include/asm-generic/tlb.h: In function
> > > 'tlb_remove_page':
> > > include/asm-generic/tlb.h:105: error: implicit declaration of function
> > > 'page_cache_release' make[1]: *** [arch/um/kernel/smp.o] Error 1
> > > make: *** [arch/um/kernel] Error 2
> > >
> > > I've been doing the following to fix it. I know it's not the right fix,
> > > (see the earlier thread about it at http://lkml.org/lkml/2007/8/24/441 )
> > > but could the one line fix go into the -stable queue 2.6.23 while a
> > > proper fix goes into 2.6.24?
> >
> > I think the patches that I have just added to the stable queue for
> > 2.6.23.2 will fix this. If not, please let me know after testing.
>
> Where do I find these patches to test? I know where to find the stable
> releases, but not the "stable queue".
>
> Documentation/stable_kernel_rules.txt just says there _is_ a stable queue, not
> where to access it. Google's first hit for "linux stable queue" was
> http://git.kernel.org/?p=linux/kernel/git/chrisw/stable-queue.git;a=shortlog
> which apparently stopped updating in march...
>
> Happy to test the patch you mentioned, if I can figure out where to find it...

It's at:
	http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=summary

Chris and I used to have separate queues, but that got messy, I suppose
we should just delete those old copies...

thanks,

greg k-h
warning, huge 2.6.23-stable review cycle about to start
Ok, I've been slacking on the -stable front for a bit here, and didn't
realize how far behind I've gotten. Everyone has been sending patches
in, which is great, but now we are facing a HUGE 114 patch release.

As there's no real way that everyone can review all of these patches,
I've decided to split them up into 6 different categories, and will be
sending patches out in these categories for review. If people can just
glance over the ones in the areas they care about, I would really
appreciate it.

The 6 categories are:
	- core kernel changes
	- arch specific changes
	- networking changes
	- network driver changes
	- non-network driver changes
	- filesystems

Consider this a warning that your inbox is going to be filled up soon,
so everyone can prepare those filters :)

thanks,

greg k-h
Re: User Mode Linux still broken in 2.6.23.1
On Wednesday 14 November 2007 12:54:44 Greg KH wrote:
> On Sat, Nov 03, 2007 at 11:51:50PM -0500, Rob Landley wrote:
> > Building with the attached .config on x86-64, it does this:
> >
> > CC arch/um/kernel/smp.o
> > In file included from include/asm/arch/tlb.h:11,
> > from include/asm/tlb.h:4,
> > from arch/um/kernel/smp.c:8:
> > include/asm-generic/tlb.h: In function 'tlb_flush_mmu':
> > include/asm-generic/tlb.h:76: error: implicit declaration of function
> > 'release_pages' include/asm-generic/tlb.h: In function
> > 'tlb_remove_page':
> > include/asm-generic/tlb.h:105: error: implicit declaration of function
> > 'page_cache_release' make[1]: *** [arch/um/kernel/smp.o] Error 1
> > make: *** [arch/um/kernel] Error 2
> >
> > I've been doing the following to fix it. I know it's not the right fix,
> > (see the earlier thread about it at http://lkml.org/lkml/2007/8/24/441 )
> > but could the one line fix go into the -stable queue 2.6.23 while a
> > proper fix goes into 2.6.24?
>
> I think the patches that I have just added to the stable queue for
> 2.6.23.2 will fix this. If not, please let me know after testing.

Where do I find these patches to test? I know where to find the stable
releases, but not the "stable queue".

Documentation/stable_kernel_rules.txt just says there _is_ a stable queue,
not where to access it. Google's first hit for "linux stable queue" was
http://git.kernel.org/?p=linux/kernel/git/chrisw/stable-queue.git;a=shortlog
which apparently stopped updating in march...

Happy to test the patch you mentioned, if I can figure out where to find
it...

Thanks,

Rob
--
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.
Re: [BUG] New Kernel Bugs
On Wed, 14 Nov 2007, Linus Torvalds wrote:
>
> So even at 100% dirty limits, it won't let you dirty more than 1GB on the
> default 32-bit setup.

Side note: all of these are obviously still just heuristics. If you
really *do* run on a 32-bit kernel, and you want to have the pain, I'm
sure you can just disable the dirty limits with a one-liner kernel mod.
And if it's useful enough, we can certainly expose flags like that.. Not
that I expect that much anybody else will ever care, but it's not like
it's wrong to expose the silly heuristics the kernel has to users that
have very specific loads.

That said, I still do hope you aren't actually using HIGHMEM64G. I was
really hoping that the people who had enough moolah to buy >4GB of RAM
had long since also upgraded to a 64-bit machine ;)

		Linus
Re: [patch 5/8] Immediate Values - x86 Optimization
* Rusty Russell ([EMAIL PROTECTED]) wrote:
> On Thursday 15 November 2007 15:06:10 Mathieu Desnoyers wrote:
> > * Rusty Russell ([EMAIL PROTECTED]) wrote:
> > > A stop_machine (or lightweight variant using IPI) would be sufficient and
> > > vastly simpler. Trying to patch NMI handlers while they're running is
> > > already crazy.
> >
> > I wouldn't mind if it was limited to the code within do_nmi(), but then
> > we would have to accept potential GPF if
> >
> > A - the NMI or MCE code calls any external kernel code (printk,
> > notify_die, spin_lock/unlock, die_nmi, lapic_wd_event (perfctr code,
> > calls printk too for debugging)...
>
> Sure, but as I pointed out previously, such calls are already best effort.
> You can do very little safely from do_nmi(), and calling printk isn't one of
> them,

yes and no.. do_nmi uses the "bust spinlocks" exactly for this. So this
is ok by design. Other than this, we can end up mixing up the console
data output with different sources of characters, but I doubt something
really bad can happen (like a deadlock).

> nor is grabbing a spinlock (well, actually you could as long as it's
> *only* used by NMI handlers. See any of those?).

Yup, see arch/x86/kernel/nmi_64.c : nmi_watchdog_tick()

It defines a spinlock to "Serialise the printks". I guess it's good to
protect against other nmi watchdogs running on other CPUs concurrently,
I guess.

>
> > Therefore, if one decides to use the immediate values to
> > leave dormant spinlock instrumentation in the kernel, I wouldn't want it
> > to have undesirable side-effects (GPF) when the instrumentation is
> > being enabled, as rare as it could be.
>
> It's overengineered, since it's less likely than deadlock already.
>

So should we put a warning telling "enabling tracing or profiling on a
production system that also uses NMI watchdog could potentially cause a
crash" ? The rarer a bug is, the more difficult it is to debug. It does
not make the bug hurt less when it happens.

The normal thing to do when a potential deadlock is detected is to fix
it, not to leave it there under the premise that it doesn't matter since
it happens rarely. In our case, where we know there is a potential race,
I don't see any reason not to make sure it never happens. What's the
cost of it ?

arch/x86/kernel/immediate.o : 2.4K

let's compare..

kernel/stop_machine.o : 3.9K

so I think that code size is not an issue there, especially since the
immediate values are not meant to be deployed on embedded systems.

> > > I'd keep this version up your sleeve for the day when it's needed.
> >
> > If we choose to go this way, stop_machine would have to do a sync_core()
> > on every CPU before it reactivates interrupts for this to respect
> > Intel's errata.
>
> Yes, I don't think stop_machine is actually what you want anyway, since you
> are happy to run in interrupt context. An IPI-based scheme is probably
> better, and also has the side effect of iret doing the sync you need, IIUC.
>

Yup, looping in IPIs with interrupts disabled should do the job there.
It's just awful for interrupt latency on large SMP systems :( Being
currently bad at it is not a reason to make it worse. If we have a CPU
that is within a high latency irq disable region when we send the IPI,
we can easily end up waiting for this critical section to end with
interrupts disabled on all CPUs. The fact that we would wait for the
longest interrupt disable region with IRQs disabled implies that we
increase the maximum latency of the system, by design. I'm not sure I
would like to be the new longest record-beating IRQ off region.

Mathieu

> Hope that clarifies,
> Rusty.

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: 2.6.24-rc2-mm1 (memory hotplug x86_64/vmemmap fix)
On Thu, Nov 15, 2007 at 01:29:19PM +0900, KAMEZAWA Hiroyuki wrote:
> Fixes for memory hotplug compile and .section handling.
>
> This patch fixes following bugs
> ==
> WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to
> .init.text:find_e820_area (between 'init_memory_mapping' and
> 'arch_add_memory')
> WARNING: vmlinux.o(.text+0x946b5): Section mismatch: reference to
> .init.text:__alloc_bootmem_node (between 'vmemmap_alloc_block' and
> 'vmemmap_pgd_populate')
>
> ERROR: "memory_add_physaddr_to_nid" [drivers/acpi/acpi_memhotplug.ko]
> undefined!
> make[1]: *** [__modpost
> ==
>
> This patch does
> 1. export memory_add_physaddr_to_nid().
> 2. changes __init to __init_refok find_early_table_space() (x86/mm/init_64.c)
> 3. changes __init_refok to __meminit in mm/sparse.c (This is bug.)
> 4. add wrapper function to call bootmem allocator without warning.
>
> After seeing "3", I thought simple __init_refok is dangerous and decided to
> add wrapper function to call bootmem, is this style acceptable ?

Hi KAMEZAWA,

Thanks for the patch, it resolves memory_add_physaddr_to_nid() build
error for me.

Tested-by: Kamalesh Babulal <[EMAIL PROTECTED]>

Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

 arch/x86/mm/init_64.c |    2 +-
 arch/x86/mm/srat_64.c |    1 +
 mm/sparse-vmemmap.c   |   13 ++++++++++++-
 mm/sparse.c           |   12 ++++++++++--
 4 files changed, 24 insertions(+), 4 deletions(-)

===================================================================
--- linux-2.6.24-rc2-mm1.orig/arch/x86/mm/srat_64.c
+++ linux-2.6.24-rc2-mm1/arch/x86/mm/srat_64.c
@@ -562,3 +562,4 @@ int memory_add_physaddr_to_nid(u64 start
 	return ret;
 }
 
+EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
Index: linux-2.6.24-rc2-mm1/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.24-rc2-mm1.orig/arch/x86/mm/init_64.c
+++ linux-2.6.24-rc2-mm1/arch/x86/mm/init_64.c
@@ -319,7 +319,7 @@ static void __meminit phys_pud_init(pud_
 	__flush_tlb();
 }
 
-static void __init find_early_table_space(unsigned long end)
+static void __init_refok find_early_table_space(unsigned long end)
 {
 	unsigned long puds, pmds, tables, start;
 
Index: linux-2.6.24-rc2-mm1/mm/sparse.c
===================================================================
--- linux-2.6.24-rc2-mm1.orig/mm/sparse.c
+++ linux-2.6.24-rc2-mm1/mm/sparse.c
@@ -55,7 +55,15 @@ static inline void set_section_nid(unsig
 #endif
 
 #ifdef CONFIG_SPARSEMEM_EXTREME
-static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
+/*
+ * for avoiding section mismatch.
+ */
+static void __init_refok *__call_bootmem_alloc(int nid, int array_size)
+{
+	return alloc_bootmem_node(NODE_DATA(nid), array_size);
+}
+
+static struct mem_section noinline __meminit *sparse_index_alloc(int nid)
 {
 	struct mem_section *section = NULL;
 	unsigned long array_size = SECTIONS_PER_ROOT *
@@ -64,7 +72,7 @@ static struct mem_section noinline __ini
 	if (slab_is_available())
 		section = kmalloc_node(array_size, GFP_KERNEL, nid);
 	else
-		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
+		section = __call_bootmem_alloc(nid, array_size);
 
 	if (section)
 		memset(section, 0, array_size);
Index: linux-2.6.24-rc2-mm1/mm/sparse-vmemmap.c
===================================================================
--- linux-2.6.24-rc2-mm1.orig/mm/sparse-vmemmap.c
+++ linux-2.6.24-rc2-mm1/mm/sparse-vmemmap.c
@@ -30,6 +30,17 @@
 #include
 
 /*
+ * wrapper for calling bootmem alloc from __meminit code.
+ */
+void __init_refok *__call_alloc_bootmem(int node,
+			int size, int align, int goal)
+{
+	return __alloc_bootmem_node(NODE_DATA(node), size, align, goal);
+}
+
+
+
+/*
  * Allocate a block of memory to be used to back the virtual memory map
  * or to back the page tables that are used to create the mapping.
  * Uses the main allocators if they are available, else bootmem.
@@ -44,7 +55,7 @@ void * __meminit vmemmap_alloc_block(uns
 			return page_address(page);
 		return NULL;
 	} else
-		return __alloc_bootmem_node(NODE_DATA(node), size, size,
+		return __call_alloc_bootmem(node, size, size,
 				__pa(MAX_DMA_ADDRESS));
 }
 
Re: [BUG] New Kernel Bugs
On Thu, 15 Nov 2007, Bron Gondwana wrote:
>
> So we've already been running those settings for a while. They didn't
> help.

Ok, so something else is up. If the mmap file is 2G, and you have 6G of
RAM, you shouldn't be hitting the dirty limits with those setups.

Of course, it may still be that some accounting thing is simply off, and
the dirty limits trigger *despite* all the proper config settings ;)

> Guess we'd better get on to figuring building a simple test app.

Yeah, if you have something that others can see in action, that is sure
going to get more people to look at it.

That said - I'm sincerely hoping that you're not running on a 32-bit
kernel. Because if so, those percentages are percentages of *normal*
memory, not highmem (that got changed at one point after people ran out
of lowmem). So even at 100% dirty limits, it won't let you dirty more
than 1GB on the default 32-bit setup.

		Linus
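A rough worked example of the percentages discussed in this message, as a
sketch only: it treats each limit as a plain percentage of some base
amount of memory. The helper name dirty_limit_mib() is made up for
illustration, 6144 MiB stands in for the 6G machine in the thread, and
896 MiB is an assumed lowmem size for a default 32-bit x86 split (roughly
the "1GB" mentioned). The kernel's real accounting is more involved than
this.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: upper bound on dirty memory, in MiB, when the
 * limit is ratio_percent of base_mib. On a 64-bit kernel base_mib would
 * be (roughly) all of RAM; on a 32-bit kernel only lowmem counts.
 * Integer division truncates towards zero.
 */
uint64_t dirty_limit_mib(uint64_t base_mib, unsigned int ratio_percent)
{
	assert(ratio_percent <= 100);
	return base_mib * ratio_percent / 100;
}
```

Under these assumptions an 80% limit on 6144 MiB is about 4915 MiB,
comfortably above a 2G mapping, while even a 100% limit against 896 MiB
of lowmem stays below 1G.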
Re: x86 32-bit machine check handler
> I found patch from about three years ago that implemented a 32-bit
> version of the x86_64 machine check handler. Do you know of any newer
> attempts?

No.

> However, given the merge of x86, a single implementation should be able
> to handle both the 32-bit and 64-bit cases. I tried to build the 64-bit
> machine check handler (mce_64.c) for 32-bit to see what kind problems it
> would run into. So far I found a few things:
> - there is no idle_notifier_register in 32-bit x86

There used to be one, just needs to be readded.

> - there is no oops_begin in 32-bit x86
> - register names are different (rip, cs)

regs->rip -> instruction_pointer()
->cs just needs a similar macro

> So it looks like giving 32-bit x86 the same machine check support as in
> 64-bit is both feasible and desirable.

Yep.

-Andi
Re: [BUG] New Kernel Bugs
On Wed, Nov 14, 2007 at 08:24:53PM -0800, Linus Torvalds wrote: > > > On Thu, 15 Nov 2007, Bron Gondwana wrote: > > > > And congratulations to him for that. We almost entirely dropped 2.6.16, > > but there's a regression some time since then that makes large MMAPed > > files a major pain (specifically the dcc database clean takes about 5 > > minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels) > > > > But we keep putting off writing a small testcase that can repeat the > > issue so we can bisect it - because it's working fine with 2.6.16 on > > that machine. > > Heh. I suspect you don't even need to bisect it. > > The big difference with large mmap'ed files is that later kernels will > actually track dirty ratios for dirty mmap'ed pages. Earlier kernels never > did. > > So in older kernels, you can dirty as much memory as you want, and the > kernel will never try to write it back (well - "never" here means one of > either (a) you ask it to with msync or (b) you run out of memory, when the > kernel then totally falls down and the machine is essentially unusuable). > > So *if* the symptom seems to be that the later kernels do a lot more IO, > then try to change > > /proc/sys/vm/dirty_[background_]ratio > > which is just a percentage of memory (defaults to 5% for background and > 10% for foreground dirtying). Turn them both up a lot (say to 50 and 80 > percent respectively) and see if that makes a difference. >From our sysctl.conf: # This should help reduce flushing on Cache::FastMmap files vm.dirty_background_ratio = 50 vm.dirty_expire_centisecs = 9000 vm.dirty_ratio = 80 vm.dirty_writeback_centisecs = 3000 So we've already been running those settings for a while. They didn't help. We also gave this thing its very own dedicated ServeRAID card and associated RAID1 set of high speed SCSI drives (mainly because they were just sitting there already attached to the machine and unused, we don't love DCC that much) and it didn't help. 
Helped the rest of the machine now that the system drive wasn't being
pegged 100% for 12 hours a day, but it didn't speed things up any.

It was making some pretty random little scattered changes all through
that file. Hmm.. here's what the developers said about it:

   First dbclean creates a new dcc_db file by copying from the old file.
   As it copies, it decides whether each record is worth keeping. That
   involves looking up the checksums in the old hash table. This is
   almost as fast as a simple /bin/cp if the old dcc_db and dcc_db.hash
   files fit in RAM.

   The dbclean creates a new dcc_db.hash file. This starts with creating
   an empty new dcc_db.hash file. Then the new dcc_db and dcc_db.hash
   files are mapped into memory, and dbclean creates pointers to each
   checksum in the dcc_db file in the dcc_db.hash file.

   While dbclean is running, dccd unmaps everything and tries to stay
   out of the way.

> If so, you'll be the first one to officially even notice this change, I
> think.

Yay for us. Thankfully it doesn't affect Cyrus's MMAP usage (read only
with direct seek and write calls to change anything, then remap) or we
would have suffered pretty badly!

Guess we'd better get on to figuring building a simple test app.

The mmap file that DCC uses is about 2GB if that makes any difference:

-rw-r--r-- 1 dcc dcc 2035138560 Nov 15 00:15 dcc_db
-rw-r--r-- 1 dcc dcc  516612096 Nov 14 06:27 dcc_db.hash

The machine has 6GB of memory and should be able to fit these files
fine:

[EMAIL PROTECTED] hm]$ free
             total       used       free     shared    buffers     cached
Mem:       6232364    5758112     474252          0      41756    3002528
-/+ buffers/cache:    2713828    3518536
Swap:      2048248      74944    1973304

And here's what top says about the process:

   15   0 1914m  57m  41m D    5  1.0 346:07.79 dccd

This is on: 2.6.16.55-reiserfix-fai (one small patch to reiserfs, and
built with netboot support for FAI)

So yeah - we'll try to get a clearer idea of what it's doing, but the
knob twiddle didn't work for us.

Bron.
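One possible shape for the "simple test app" mentioned in this message,
offered as a minimal sketch rather than a reproduction of what dbclean
does: it maps a file MAP_SHARED, dirties pseudo-randomly chosen pages,
and then forces writeback with msync(). The function name
scatter_dirty(), the fixed seed and the access pattern are all
assumptions made for illustration; a real reproducer would need sizes
and a write pattern closer to the dcc_db workload.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical reproducer core: dirty `writes` pseudo-randomly chosen
 * pages of a shared mapping of `path`, which is created or resized to
 * `size` bytes. Returns 0 on success, -1 on any failure.
 */
int scatter_dirty(const char *path, size_t size, size_t writes)
{
	long page = sysconf(_SC_PAGESIZE);
	uint32_t state = 12345;	/* fixed seed so runs are repeatable */
	unsigned char *map;
	size_t npages, i;
	int fd, ret = -1;

	assert(path != NULL);
	if (page <= 0 || size < (size_t)page)
		return -1;
	npages = size / (size_t)page;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)size) < 0)
		goto out_close;

	map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	for (i = 0; i < writes; i++) {
		/* linear congruential step; randomness quality is irrelevant */
		state = state * 1664525u + 1013904223u;
		map[(size_t)(state % npages) * (size_t)page] = (unsigned char)i;
	}

	/* force writeback so the I/O cost is part of whatever gets timed */
	if (msync(map, size, MS_SYNC) == 0)
		ret = 0;
	munmap(map, size);
out_close:
	close(fd);
	return ret;
}
```

Timing a call such as scatter_dirty("testfile", size, writes) on an old
and a new kernel, with a size approaching the 2GB file described above,
would be one way to check whether the slowdown follows the dirty-page
accounting change.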
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Tuesday November 13, [EMAIL PROTECTED] wrote:
>
> raid5-fix-unending-write-sequence.patch is in -mm and I believe is
> waiting on an Acked-by from Neil?
>

It seems to have just been sent on to Linus, so it probably will go in
without:

   Acked-By: NeilBrown <[EMAIL PROTECTED]>

I'm beginning to think that I really should sit down and make sure I
understand exactly how those STRIPE_OP_ flags are used. They generally
make sense but there seem to be a number of corner cases where they
aren't quite handled properly.. Maybe they are all found now, or maybe

NeilBrown
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
On Thu, 15 Nov 2007, Paul Mackerras wrote:

> dean gaudet writes:
>
> > actually multiplexing is the main feature i am in need of. there are an
> > insufficient number of counters (even on k8 with 4 counters) to do
> > complete stall accounting or to get a general overview of L1d/L1i/L2 cache
> > hit rates, average miss latency, time spent in various stalls, and the
> > memory system utilization (or HT bus utilization). this runs out to
> > something like 30 events which are interesting... and re-running a
> > benchmark over and over just to get around the lack of multiplexing is a
> > royal pain in the ass.
>
> So by "multiplexing" do you mean the ability to have multiple event
> sets associated with a context and have the kernel switch between them
> automatically?

yep.

-dean
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
dean gaudet writes:

> actually multiplexing is the main feature i am in need of. there are an
> insufficient number of counters (even on k8 with 4 counters) to do
> complete stall accounting or to get a general overview of L1d/L1i/L2 cache
> hit rates, average miss latency, time spent in various stalls, and the
> memory system utilization (or HT bus utilization). this runs out to
> something like 30 events which are interesting... and re-running a
> benchmark over and over just to get around the lack of multiplexing is a
> royal pain in the ass.

So by "multiplexing" do you mean the ability to have multiple event
sets associated with a context and have the kernel switch between them
automatically?

Paul.
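To make the definition in this message concrete, here is a small sketch
of the arithmetic behind multiplexing. It assumes that event sets take
turns on the hardware counters and that a raw count is scaled up by the
fraction of time its set was actually scheduled. This is not perfmon2's
interface; both function names and the scaling rule are illustrative
assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: number of event sets needed to cover nr_events events. */
unsigned int event_sets_needed(unsigned int nr_events,
			       unsigned int nr_counters)
{
	assert(nr_counters > 0);
	return (nr_events + nr_counters - 1) / nr_counters;	/* round up */
}

/*
 * Hypothetical: estimate the full-run count for an event whose set was
 * live on the counters for active_ns out of total_ns of run time.
 */
double scale_count(uint64_t raw, uint64_t total_ns, uint64_t active_ns)
{
	assert(active_ns > 0 && active_ns <= total_ns);
	return (double)raw * (double)total_ns / (double)active_ns;
}
```

With the figures from the thread, about 30 events of interest on 4
counters means 8 event sets, so with equal time slices each raw count
would be scaled by roughly a factor of 8. The estimate is only as good
as the assumption that the workload behaves uniformly across slices.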
Re: [patch 5/8] Immediate Values - x86 Optimization
On Thursday 15 November 2007 15:06:10 Mathieu Desnoyers wrote: > * Rusty Russell ([EMAIL PROTECTED]) wrote: > > A stop_machine (or lightweight variant using IPI) would be sufficient and > > vastly simpler. Trying to patch NMI handlers while they're running is > > already crazy. > > I wouldn't mind if it was limited to the code within do_nmi(), but then > we would have to accept potential GPF if > > A - the NMI or MCE code calls any external kernel code (printk, > notify_die, spin_lock/unlock, die_nmi, lapic_wd_event (perfctr code, > calls printk too for debugging)... Sure, but as I pointed out previously, such calls are already best effort. You can do very little safely from do_nmi(), and calling printk isn't one of them, nor is grabbing a spinlock (well, actually you could as long as it's *only* used by NMI handlers. See any of those?). > Therefore, if one decides to use the immediate values to > leave dormant spinlock instrumentation in the kernel, I wouldn't want it > to have undesirable side-effects (GPF) when the instrumentation is > being enabled, as rare as it could be. It's overengineered, since it's less likely than deadlock already. > > I'd keep this version up your sleeve for they day when it's needed. > > If we choose to go this way, stop_machine would have to do a sync_core() > on every CPU before it reactivates interrupts for this to respect > Intel's errata. Yes, I don't think stop_machine is actually what you want anyway, since you are happy to run in interrupt context. An IPI-based scheme is probably better, and also has the side effect of iret doing the sync you need, IIUC. Hope that clarifies, Rusty. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] New Kernel Bugs
On Tue, Nov 13, 2007 at 10:56:01PM +0100, Christian Kujau wrote:
> On Tue, 13 Nov 2007, Andrew Morton wrote:
>> There are a number of process things we _could_ do. Like
>> - have bugfix-only kernel releases
>
> Adrian Bunk does (did?) this with 2.6.16.x, although it always seemed to me
> like an unrewarded one man show. AFAIK not even the big distros are begging
> for bugfix-only versions, as they too want to have (sell) new features.
> Mission critical systems might want to require such versions, but I guess
> they're using heavily customized trees anyway.

And congratulations to him for that. We almost entirely dropped 2.6.16,
but there's a regression some time since then that makes large MMAPed
files a major pain (specifically the dcc database clean takes about 5
minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels)

But we keep putting off writing a small testcase that can repeat the
issue so we can bisect it - because it's working fine with 2.6.16 on
that machine.

Bron.
Re: 2.6.24-rc2-mm1 (memory hotplug x86_64/vmemmap fix)
Fixes for memory hotplug compile and .section handling. This patch fixes following bugs == WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to .init.text:f ind_e820_area (between 'init_memory_mapping' and 'arch_add_memory') WARNING: vmlinux.o(.text+0x946b5): Section mismatch: reference to .init.text: __alloc_bootmem_node (between 'vmemmap_alloc_block' and 'vmemmap_pgd_populate') ERROR: "memory_add_physaddr_to_nid" [drivers/acpi/acpi_memhotplug.ko] undefined! make[1]: *** [__modpost == This patch does 1. export memory_add_physaddr_to_nid(). 2. changes __init to __init_refok find_early_table_space() (x86/mm/init_64.c) 3. changes __init_refok to __meminit in mm/sparse.c (This is bug.) 4. add wrapper function to call bootmem allocator without warning. After seeing "3", I thought simple __init_refok is dangerous and decided to add wrapper function to call bootmem, is this style acceptable ? Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> arch/x86/mm/init_64.c |2 +- arch/x86/mm/srat_64.c |1 + mm/sparse-vmemmap.c | 13 - mm/sparse.c | 12 ++-- 4 files changed, 24 insertions(+), 4 deletions(-) Index: linux-2.6.24-rc2-mm1/arch/x86/mm/srat_64.c === --- linux-2.6.24-rc2-mm1.orig/arch/x86/mm/srat_64.c +++ linux-2.6.24-rc2-mm1/arch/x86/mm/srat_64.c @@ -562,3 +562,4 @@ int memory_add_physaddr_to_nid(u64 start return ret; } +EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid); Index: linux-2.6.24-rc2-mm1/arch/x86/mm/init_64.c === --- linux-2.6.24-rc2-mm1.orig/arch/x86/mm/init_64.c +++ linux-2.6.24-rc2-mm1/arch/x86/mm/init_64.c @@ -319,7 +319,7 @@ static void __meminit phys_pud_init(pud_ __flush_tlb(); } -static void __init find_early_table_space(unsigned long end) +static void __init_refok find_early_table_space(unsigned long end) { unsigned long puds, pmds, tables, start; Index: linux-2.6.24-rc2-mm1/mm/sparse.c === --- linux-2.6.24-rc2-mm1.orig/mm/sparse.c +++ linux-2.6.24-rc2-mm1/mm/sparse.c @@ -55,7 +55,15 @@ static inline void set_section_nid(unsig #endif #ifdef 
CONFIG_SPARSEMEM_EXTREME -static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) +/* + * for avoiding section mismatch. + */ +static void __init_refok *__call_bootmem_alloc(int nid, int array_size) +{ + return alloc_bootmem_node(NODE_DATA(nid), array_size); +} + +static struct mem_section noinline __meminit *sparse_index_alloc(int nid) { struct mem_section *section = NULL; unsigned long array_size = SECTIONS_PER_ROOT * @@ -64,7 +72,7 @@ static struct mem_section noinline __ini if (slab_is_available()) section = kmalloc_node(array_size, GFP_KERNEL, nid); else - section = alloc_bootmem_node(NODE_DATA(nid), array_size); + section = __call_bootmem_alloc(nid, array_size); if (section) memset(section, 0, array_size); Index: linux-2.6.24-rc2-mm1/mm/sparse-vmemmap.c === --- linux-2.6.24-rc2-mm1.orig/mm/sparse-vmemmap.c +++ linux-2.6.24-rc2-mm1/mm/sparse-vmemmap.c @@ -30,6 +30,17 @@ #include /* + * wrapper for calling bootmem alloc from __meminit code. + */ +void __init_refok *__call_alloc_bootmem(int node, + int size, int align, int goal) +{ + return __alloc_bootmem_node(NODE_DATA(node), size, align, goal); +} + + + +/* * Allocate a block of memory to be used to back the virtual memory map * or to back the page tables that are used to create the mapping. * Uses the main allocators if they are available, else bootmem. @@ -44,7 +55,7 @@ void * __meminit vmemmap_alloc_block(uns return page_address(page); return NULL; } else - return __alloc_bootmem_node(NODE_DATA(node), size, size, + return __call_alloc_bootmem(node, size, size, __pa(MAX_DMA_ADDRESS)); } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] New Kernel Bugs
On Thu, 15 Nov 2007, Bron Gondwana wrote:
>
> And congratulations to him for that. We almost entirely dropped 2.6.16,
> but there's a regression some time since then that makes large MMAPed
> files a major pain (specifically the dcc database clean takes about 5
> minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels)
>
> But we keep putting off writing a small testcase that can repeat the
> issue so we can bisect it - because it's working fine with 2.6.16 on
> that machine.

Heh. I suspect you don't even need to bisect it.

The big difference with large mmap'ed files is that later kernels will
actually track dirty ratios for dirty mmap'ed pages. Earlier kernels never
did.

So in older kernels, you can dirty as much memory as you want, and the
kernel will never try to write it back (well - "never" here means one of
either (a) you ask it to with msync or (b) you run out of memory, when the
kernel then totally falls down and the machine is essentially unusable).

So *if* the symptom seems to be that the later kernels do a lot more IO,
then try to change

	/proc/sys/vm/dirty_[background_]ratio

which is just a percentage of memory (defaults to 5% for background and
10% for foreground dirtying). Turn them both up a lot (say to 50 and 80
percent respectively) and see if that makes a difference.

If so, you'll be the first one to officially even notice this change, I
think.

		Linus
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
On Wed, 14 Nov 2007, Andi Kleen wrote:

> Later a syscall might be needed with event multiplexing, but that seems
> more like a far away non essential feature.

actually multiplexing is the main feature i am in need of. there are an
insufficient number of counters (even on k8 with 4 counters) to do
complete stall accounting or to get a general overview of L1d/L1i/L2 cache
hit rates, average miss latency, time spent in various stalls, and the
memory system utilization (or HT bus utilization). this runs out to
something like 30 events which are interesting... and re-running a
benchmark over and over just to get around the lack of multiplexing is a
royal pain in the ass.

it's not a "far away non-essential feature" to me. it's something i would
use daily if i had all the pieces together now (and i'm constrained
because i cannot add an out-of-tree patch which adds unofficial syscalls
to the kernel i use).

-dean
Re: [alsa-devel] [BUG] New Kernel Bugs
On Wed, Nov 14, 2007 at 12:46:24PM +0100, Rene Herman wrote: > On 14-11-07 11:07, David Miller wrote: > > Added Jaroslav and Takashi to the already extensive CC > >> From: Russell King <[EMAIL PROTECTED]> > >>> So, when are you creating a replacement alsa-devel mailing list on >>> vger? That's also subscribers-only. >> The operative term is "alternative" rather than "replacement". >> Perhaps this misunderstanding is what you're so upset about. >> And yes, that alsa list bugs the crap out of me too. I'm more than >> happy to provide an alternative for that one as well. > > [EMAIL PROTECTED] is not subscriber-only. Same as that arm list, > it's _moderated_ for non-subscribers and given that I and other moderators > have been doing our best to moderate quickly (I tend to stay logged in to > the moderation interface all day for example) what specifically bugged the > crap out of you? It's not something a poster needs to concern himself with. Totally unrelated - I sent something to the kolab mailing list a couple of days ago (it's moderated for non subscribers) informing them that I had found the cause of some Cyrus bugs that they had problems with in the past and providing a link to my post to the cyrus list with the patches attached. It sat in the moderation queue and then was rejected with "non subscriber post to subscription only list". Not only was the reponse a day later when I had moved on to other things, but it got me really pissed off that I had put some effort into providing a good quality post that outlined the specific issues and how they applied to their project, and had been summarily dismissed, probably without the effort being put in. There's no way for a non-subscriber to know in advance if the list they are trying to post to will do that to them, completely negating the effort put in to writing something worthwhile to inform that community. It's insular, and it sucks. 
So yeah, my attitude now is that the Kolab folks can go screw themselves and track down the fix on their own or wait until I've convinced upstream to accept the fixes (likely) and they have moved to the new version (unlikely for a long time, and meanwhile they're missing out on the performance increases that having a more stable skiplist library would give them) I'm sure if I had something that I considered worth informing the ALSA project of, I'd be wary of spending the same effort writing a good post knowing it may be dropped in between the by a list moderator just selecing all and bouncing them. Bron. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] sata_nv: fix ATAPI issues with memory over 4GB (v3)
Robert Hancock wrote:
> Tejun Heo wrote:
>> Robert Hancock wrote:
>>> This fixes some problems with ATAPI devices on nForce4 controllers in
>>> ADMA mode
>>> on systems with memory located above 4GB. We need to delay setting
>>> the 64-bit
>>> DMA mask until the PRD table and padding buffer are allocated so that
>>> they don't
>>> get allocated above 4GB and break legacy mode (which is needed for ATAPI
>>> devices).
>>>
>>> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>
>>
>> applied to #tj-upstream-fixes.
>>
>
> I have a report that these patches crashed but the previous patch worked:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=351451
>
> So there may still be a problem here.

Hmmm... The change seemed safe to me. Anyways, dropping the patch for
now. Please re-send later. Also, please format patch description such
that it fits in 80c.

Thanks.

--
tejun
Re: [PATCH 2/2] sata_nv: fix ATAPI issues with memory over 4GB (v3)
Tejun Heo wrote:
> Robert Hancock wrote:
>> This fixes some problems with ATAPI devices on nForce4 controllers in
>> ADMA mode on systems with memory located above 4GB. We need to delay
>> setting the 64-bit DMA mask until the PRD table and padding buffer are
>> allocated so that they don't get allocated above 4GB and break legacy
>> mode (which is needed for ATAPI devices).
>>
>> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>
>
> applied to #tj-upstream-fixes.

I have a report that these patches crashed but the previous patch worked:

https://bugzilla.redhat.com/show_bug.cgi?id=351451

So there may still be a problem here.
Re: [PATCH] ata_sg_setup_one vs ata_sg_setup?
Rusty Russell wrote:
> Hi Jeff,
>
> Was looking through libata, and it seems to me that ata_sg_setup is a
> superset of ata_sg_setup_one. Am I missing something? Seems like it could
> be simplified.
>
> My machine never seems to do an ata_sg_setup_one, so this patch isn't really
> tested...

I have about the same patch queued here which also kills
ata_sg_init_one() completely and replaces
ATA_QCFLAG_SG/ATA_QCFLAG_SINGLE with ATA_QCFLAG_DMAMAP (now a single
flag). I'll compare your version and mine and see if mine is missing
something.

Thanks.

--
tejun
Re: [patch 5/8] Immediate Values - x86 Optimization
* Rusty Russell ([EMAIL PROTECTED]) wrote: > On Wednesday 14 November 2007 05:58:05 Mathieu Desnoyers wrote: > > x86 optimization of the immediate values which uses a movl with code > > patching to set/unset the value used to populate the register used as > > variable source. > > For the record, I think the patching code gross overkill. > > A stop_machine (or lightweight variant using IPI) would be sufficient and > vastly simpler. Trying to patch NMI handlers while they're running is > already crazy. > I wouldn't mind if it was limited to the code within do_nmi(), but then we would have to accept potential GPF if A - the NMI or MCE code calls any external kernel code (printk, notify_die, spin_lock/unlock, die_nmi, lapic_wd_event (perfctr code, calls printk too for debugging)... B - we try to patch this code at the wrong moment I could live with that, but I would prefer to have a solid, non flaky solution. My goal is to help the kernel quality _improve_ rather than deteriorate. Therefore, if one decides to use the immediate values to leave dormant spinlock instrumentation in the kernel, I wouldn't want it to have undesirable side-effects (GPF) when the instrumentation is being enabled, as rare as it could be. > I'd keep this version up your sleeve for they day when it's needed. > If we choose to go this way, stop_machine would have to do a sync_core() on every CPU before it reactivates interrupts for this to respect Intel's errata. It's not just a matter of not executing the code while it is modified; the issue here is that we must insure that we don't have an incoherent trace cache. So, as is, stop_machine would not respect the errata. Mathieu > Rusty. -- Mathieu Desnoyers Computer Engineering Ph.D. 
Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Question about F_RDLCK and F_WRLCK on alpha
I suppose this could be analyzed better if the cpp output were presented,
to show exactly how F_RDLCK/F_WRLCK are included and defined.

But the only requirement is that the flags are unique. So long as
F_WRLCK, F_RDLCK and F_UNLCK are all distinct, everyone is happy. There
is no expectation that an i386 binary will run on an alpha machine.

There might be an issue if i386 source code used the literal constants
"0" or "1" instead of F_WRLCK/F_RDLCK and was then compiled on alpha.
Then it would be out of sync.

I suppose they are different because the folks at OSF had them defined
that way, and there was some need to run OSF/alpha binaries on
linux/alpha (just a guess on my part).

Oliver Falk wrote:
> Hi!
>
> Can someone explain to me why we have different defines for WRLCK and
> RDLCK within the alpha kernel headers:
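The point about symbolic constants can be shown with a short sketch.
The helper name set_file_lock() is made up for illustration; fcntl(),
struct flock, F_SETLK and the F_RDLCK/F_WRLCK/F_UNLCK macros are the
standard POSIX interface. Because the code never spells out a numeric
lock type, it should build correctly whatever values a given
architecture's headers assign.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply a whole-file lock of the given type using
 * the non-blocking F_SETLK command. Returns fcntl()'s result: 0 on
 * success, -1 on failure with errno set.
 */
int set_file_lock(int fd, short type)
{
	struct flock fl = {0};

	/* only the symbolic names are used, never 0/1/2 literals */
	assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "through end of file" */
	return fcntl(fd, F_SETLK, &fl);
}
```

A caller would write set_file_lock(fd, F_WRLCK) rather than passing a
number; source that hard-codes the i386 values is the case described
above that would go out of sync when rebuilt on alpha.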
Re: [PATCH] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
On Wednesday 14 November 2007 09:08, Jesper Nilsson wrote:
> Scrap the local __INLINE__ macro, and rename timeval_cmp to fasttime_cmp.
>
> Inline macro was completely unnecessary since the macro was defined
> locally to be inline.
> timeval_cmp was inaccurately named since it does comparison on
> struct fasttimer_t and not on struct timeval.
>
> Signed-off-by: Jesper Nilsson <[EMAIL PROTECTED]>
> ---
> fasttimer.c | 16 +++-
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/arch/cris/arch-v10/kernel/fasttimer.c b/arch/cris/arch-v10/kernel/fasttimer.c
> index 645d705..c1a3a21 100644
> --- a/arch/cris/arch-v10/kernel/fasttimer.c
> +++ b/arch/cris/arch-v10/kernel/fasttimer.c
> @@ -46,8 +46,6 @@ static int sanity_failed;
>  #define D2(x)
>  #define DP(x)
>
> -#define __INLINE__ inline
> -
>  static unsigned int fast_timer_running;
>  static unsigned int fast_timers_added;
>  static unsigned int fast_timers_started;
> @@ -118,13 +116,13 @@ int timer_freq_settings[NUM_TIMER_STATS];
>  int timer_delay_settings[NUM_TIMER_STATS];
>
>  /* Not true gettimeofday, only checks the jiffies (uptime) + useconds */
> -void __INLINE__ do_gettimeofday_fast(struct fasttime_t *tv)
> +inline void do_gettimeofday_fast(struct fasttime_t *tv)

Why are these functions not "static inline"? Without "static", gcc will
actually create a non-inlined version of them!

$ cat t.c
inline int f() { return 1; }
int g() { return f(); }
$ gcc -O2 -c t.c
$ nm --size-sort t.o
000a T f   <=== !!!
000a T g

P.S. whitespace style in fasttimer.c doesn't match the rest of the kernel
(kernel uses tabs, not 2-space indentation). Curly braces don't match
either:

  if (t0->tv_sec < t1->tv_sec)
  {
    return -1;
  }

should be

	if (t0->tv_sec < t1->tv_sec) {
		return -1;
	}

--
vda
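A minimal sketch of the "static inline" form suggested in this review.
The struct layout and the names fasttime_sketch and fasttime_cmp_sketch
are assumptions for illustration, not the actual contents of
fasttimer.c; only the tv_sec comparison appears in the message. With
gcc's gnu89 semantics, a plain "inline" definition also produces an
externally visible out-of-line copy (the "T f" in the nm output), and
under C99 semantics a plain "inline" definition without a matching
extern declaration can lead to link errors. "static inline" sidesteps
both.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's time structure. */
struct fasttime_sketch {
	long tv_sec;
	long tv_usec;
};

/*
 * Returns -1, 0 or 1 as t0 is earlier than, equal to, or later than t1.
 * Being static, the function has internal linkage, so the compiler is
 * free to drop the out-of-line copy when every call is inlined.
 */
static inline int fasttime_cmp_sketch(const struct fasttime_sketch *t0,
				      const struct fasttime_sketch *t1)
{
	assert(t0 != NULL && t1 != NULL);

	if (t0->tv_sec < t1->tv_sec)
		return -1;
	if (t0->tv_sec > t1->tv_sec)
		return 1;
	if (t0->tv_usec < t1->tv_usec)
		return -1;
	if (t0->tv_usec > t1->tv_usec)
		return 1;
	return 0;
}
```

The single-statement branches here omit braces, which is the usual
kernel style; the braced form shown in the review is equally valid when
a branch has more than one statement.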
Re: [PATCH] pata_sis.c: Add Packard Bell EasyNote K5305 to laptops
Gabriel C wrote:
> Hi,
>
> With newer kernels HDD in my old laptop is limited to UDMA 33.
> With this patch I get UDMA 100 again.
>
> Signed-off-by: Gabriel Craciunescu <[EMAIL PROTECTED]>

applied to #tj-upstream-fixes with Hi, edited out. :-)

--
tejun
Re: 2.6.24-rc2-mm1
On Nov 15, 2007 10:38 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: > > On Thu, 2007-11-15 at 09:01 +0800, Dave Young wrote: > > On Nov 15, 2007 5:27 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: > > > On Wed, 2007-11-14 at 20:19 +0100, Jiri Kosina wrote: > > > > On Wed, 14 Nov 2007, Kay Sievers wrote: > > > > > > > > > Could it be an init-order problem, where something tries to use the > > > > > block subsystem? Before it is initialized with: > > > > > block/genhd.c :: subsys_initcall(genhd_device_init); > > > > > If that's the case, we have an old bug that nobody noticed with static > > > > > structures, which are zeroed that time, but definitely not properly > > > > > initialized. I'll try to build loop non-modular now, and see if that > > > > > makes the bug appear here. > > > > > > > my .config with which I reproduc this on 2.6.24-rc2-mm1 reliably can be > > > > obtained from http://www.jikos.cz/jikos/junk/.config > > > > > > Hmm, that config doesn't do anything here, and if I make it boot, it > > > does not show the bug. > > > > > > Could you possibly enable kobject debugging and see if that exposes > > > something, maybe something goes wrong with the kset refcount and it gets > > > released while in use. > > > > > Hi, > > I would do that. > > That would be great. > > > BTW, The bug report as EIP at __list_add with CONFIG_DEBUG_LIST=y > > Yeah, that hints that the kset, which contains the list, is not > allocated at the time it is used, or it is already released (kfree) > again by some buggy logic. > > All this could not happen before, as the kset was statically in memory. > It may be an old bug, that just never crashed anything. We already fixed > a bunch of similar things, that showed up while doing this patch set. > Now with the DEBUG_KOBJECT set , nothing more info. 
But this time the EIP is at strnlen (called by printk -- line 239 of kobject.c):

EIP is at strnlen+0x9/0x20
EAX 6b6b6b6b EBX c05487c14 ECX 6b6b6b6b EDX fffe
---cut---

If you need more information, I will copy more (no camera in hand).

Regards
dave
Re: [patch 5/8] Immediate Values - x86 Optimization
On Wednesday 14 November 2007 05:58:05 Mathieu Desnoyers wrote:
> x86 optimization of the immediate values which uses a movl with code
> patching to set/unset the value used to populate the register used as
> variable source.

For the record, I think the patching code is gross overkill. A stop_machine (or lightweight variant using IPI) would be sufficient and vastly simpler. Trying to patch NMI handlers while they're running is already crazy.

I'd keep this version up your sleeve for the day when it's needed.

Rusty.
Re: [BUG] New Kernel Bugs
On Tuesday November 13, [EMAIL PROTECTED] wrote:
> On Tuesday 13 November 2007 07:08, Mark Lord wrote:
> > Ingo Molnar wrote:
> > ..
> >
> > > This is all QA-101 that _cannot be argued against on a rational basis_,
> > > it's just that these sorts of things have been largely ignored for
> > > years, in favor of the all-too-easy "open source means many eyeballs and
> > > that is our QA" answer, which is a _good_ answer but by far not the most
> > > intelligent answer! Today "many eyeballs" is simply not good enough and
> > > nature (and other OS projects) will route us around if we dont change.
> >
> > ..
> >
> > QA-101 and "many eyeballs" are not at all in opposition.
> > The latter is how we find out about bugs on uncommon hardware,
> > and the former is what we need to track them and overall quality.
> >
> > A HUGE problem I have with current "efforts", is that once someone
> > reports a bug, the onus seems to be 99% on the *reporter* to find
> > the exact line of code or commit. Ghad what a repressive method.
>
> This is the only method that scales.

That sounds overly harsh, and the rest of your mail sounds much more moderate and sensible - I can only assume you were using hyperbole??

Putting the "onus on the reporter" is simply not going to work unless you have a business relationship. In the community, we are all volunteering our time (well, maybe my employer is volunteering my time to do community support, but the effect is the same).

I would hope that the focus of developers is to empower bug reporters to provide further information (and as has been said, "git bisect" is a great empowerer). Some people will be incredibly helpful, especially if you ask politely and say thank you. Others won't, for any of a number of reasons - and maybe that means their bug won't get fixed.

To my eyes, the "only method that scales" is investing effort in encouraging and training bug reporters. Some of that effort might not produce results, but when others among those you have encouraged start answering the newbie questions on the list and save you the time, you get a distinct feeling that it was all worthwhile.

I think we are in agreement - I just wanted to take issue with that one sentence :-) The rest is great.

NeilBrown

>
> Developer has only 24 hours in each day, and sometimes he needs to eat,
> sleep, and maybe even pay attention to e.g. his kids.
>
> But bug reporters are much more numerous and they have more
> hours in one day combined.
>
> BUT - it means that developers should try to increase user base,
> not scare users away.
>
> > And if the "developer" who broke the damn thing, or who at least
> > "claims" to be supporting that code, cannot "reproduce" the bug,
> > they drop it completely.
>
> Developer should let reporter know that reporter needs to help
> a bit here. Sometimes a bit of hand holding is needed, but it
> pays off because you breed more qualified testers/bug reporters.
>
> > Contrast that flawed approach with how Linus does things..
> > he thinks through the symptoms, matches them to the code,
> > and figures out what the few possibilities might be,
> > and feeds back some trial balloon patches for the bug reporter to try.
> >
> > MUCH better.
> >
> > And remember, *I'm* an old-time Linux kernel developer.. just think about
> > the people reporting bugs who haven't been around here since 1992..
>
> Yes. Developers should not grow more and more unhelpful
> and arrogant towards their users just because inexperienced
> users send incomplete/poorly written bug reports.
> They need to provide help, not humiliate/ignore.
>
> I think we agree here.
> --
> vda
Re: [BUG] New Kernel Bugs
On Wednesday November 14, [EMAIL PROTECTED] wrote: > On Wed, Nov 14, 2007 at 09:38:20AM -0800, Randy Dunlap wrote: > > On Wed, 14 Nov 2007 15:08:47 +0100 Ingo Molnar wrote: > > > so please stop this "too busy and too noisy" nonsense already. It was > > > nonsense 10 years ago and it's nonsense today. In 10 years the kernel > > > grew from a 1 million lines codebase to an 8 million lines codebase, so > > > what? Deal with it and be intelligent about filtering your information > > > influx instead of imposing a hard pre-filtering criteria that restricts > > > intelligent processing of information. > > > > So you have a preferred method of handling email. Please don't > > force it on the rest of us. > > I'd be curious for any pointers on tools, actually. I "read" (ok, skim) > lkml but still overlook relevant bug reports occasionally. > (Fortunately, between Trond and Andrew and others forwarding things it's > not actually a problem, but I'm still curious). Virtual Folders. I use VM mode in EMACS, but I believe some other mail readers have the same functionality. I have a virtual folder called "nfs" which shows me all mail in my inbox which has the string 'nfs' or 'lockd' in a To, Cc, or Subject field. When I visit that folder, I see all mail about nfs, whether it was sent to me personally, or to a relevant list, or to lkml. Admittedly if someone doesn't bother to choose a meaningful Subject, then I might miss that. I think this mostly happens when Andrew sends a "-mm" announcement, asked people to change the subject line when following up, and someone follows up without changing the subject line and say "NFS doesn't work any more". I have another virtual folder which matches "md" and "raid" and "mdadm" in any header (so when the people from coraid.com talk about ATA over Ethernet, that gets badly filed, but it is a small cost). Then I have the "bkernel" (boring kernel) folder for all mail from lkml that doesn't mention nfs or raid or md, and isn't from or to me. 
That folder I skim every week or so and just read the juicy debates and look for interesting tidbits from interesting people - then delete the whole folder, mostly unread. I don't think I could cope with mail without virtual folders. NeilBrown
Re: 2.6.24-rc2-mm1
On Thu, 2007-11-15 at 09:01 +0800, Dave Young wrote: > On Nov 15, 2007 5:27 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: > > On Wed, 2007-11-14 at 20:19 +0100, Jiri Kosina wrote: > > > On Wed, 14 Nov 2007, Kay Sievers wrote: > > > > > > > Could it be an init-order problem, where something tries to use the > > > > block subsystem? Before it is initialized with: > > > > block/genhd.c :: subsys_initcall(genhd_device_init); > > > > If that's the case, we have an old bug that nobody noticed with static > > > > structures, which are zeroed that time, but definitely not properly > > > > initialized. I'll try to build loop non-modular now, and see if that > > > > makes the bug appear here. > > > > > my .config with which I reproduc this on 2.6.24-rc2-mm1 reliably can be > > > obtained from http://www.jikos.cz/jikos/junk/.config > > > > Hmm, that config doesn't do anything here, and if I make it boot, it > > does not show the bug. > > > > Could you possibly enable kobject debugging and see if that exposes > > something, maybe something goes wrong with the kset refcount and it gets > > released while in use. > > > Hi, > I would do that. That would be great. > BTW, The bug report as EIP at __list_add with CONFIG_DEBUG_LIST=y Yeah, that hints that the kset, which contains the list, is not allocated at the time it is used, or it is already released (kfree) again by some buggy logic. All this could not happen before, as the kset was statically in memory. It may be an old bug, that just never crashed anything. We already fixed a bunch of similar things, that showed up while doing this patch set. Thanks, Kay - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Attempt to get eject failures back to ioctl(CDROMEJECT)
On Wednesday 14 November 2007 23:39:31 Jens Axboe wrote: > On Wed, Nov 14 2007, Rusty Russell wrote: > > Hi Jens, > > > > As you asked for some time ago. Of course, it turns out that the > > eject command ignores the error anyway, but it's nice that it now errors. > > > > Not entirely comfortable with this patch: there's a req->errors but > > that seems to have some existing semantics I'm not sure of, so I simply > > added a new way of flagging an error. > > It is a bit of a hack, but it's not really your fault. ->errors is > somewhat messy and has different meaning depending on the request type. > I'll add your patch and then do a sanitize on top of it, so that we can > switch things over to a unified ->errno instead. Thanks! Oh, I also noticed this in scsi_tgt_lib: From: Rusty Russell <[EMAIL PROTECTED]> Date: Fri, 9 Nov 2007 20:04:54 +1100 Subject: [PATCH] scsi_tgt_lib: BUG_ON() impossible condition. If blk_rq_map_sg returns more than was allocated, it's a bug, and something's already been overwritten. BUG_ON() is probably the right thing here. 
Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
---
 drivers/scsi/scsi_tgt_lib.c | 11 +++
 1 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index a91761c..66266c8 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -367,14 +367,9 @@ static int scsi_tgt_init_cmd(struct scsi_cmnd *cmd, gfp_t gfp_mask)
 	dprintk("cmd %p cnt %d %lu\n", cmd, cmd->use_sg, rq_data_dir(rq));
 	count = blk_rq_map_sg(rq->q, rq, cmd->request_buffer);
-	if (likely(count <= cmd->use_sg)) {
-		cmd->use_sg = count;
-		return 0;
-	}
-
-	eprintk("cmd %p cnt %d\n", cmd, cmd->use_sg);
-	scsi_free_sgtable(cmd);
-	return -EINVAL;
+	BUG_ON(count > cmd->use_sg);
+	cmd->use_sg = count;
+	return 0;
 }

 /* TODO: test this crap and replace bio_map_user with new interface maybe */
--
1.5.2.5
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
David Miller writes: > From: Paul Mackerras <[EMAIL PROTECTED]> > Date: Thu, 15 Nov 2007 12:11:10 +1100 > > > The third (hard to extend cleanly) is a good point, and is a valid > > criticism of the current set of perfmon2 system calls, I think. > > However, the goal of being able to extend the interface tends to be in > > opposition to the goal of having strong typing of the interface. > > Things like a multiplexed syscall or an ioctl are much easier to > > extend but that is at the expense of losing strong typing. > > I disagree. > > With netlink we can just add new attributes when a new need arises for > a particular interface. The attribute code describes the type > precisely, so there is no loss of strong typing at all. Well you must mean something different by "strong typing" from the rest of us. Strong typing means that the compiler can check that you have passed in the correct types of arguments, but the compiler doesn't have any visibility into what structures are valid in netlink messages. In any case, I think that adding a structure size argument to the current perfmon2 system calls where appropriate would mean that we could extend them cleanly later on if necessary. It would mean that we could add fields at the end, and that the kernel could know what version of the structures that userspace was using. Paul. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
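The structure-size idea in the message above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `pmu_req_v1`, `pmu_req_v2` and `pmu_configure_sketch` are hypothetical names, not part of perfmon2 or any kernel interface, and an ordinary function stands in for the system call. The caller passes the size of the structure it was built against; the receiving side copies what it understands and zero-fills the fields the caller did not know about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Version 1 of a hypothetical argument structure. */
struct pmu_req_v1 {
	unsigned int event;
	unsigned int flags;
};

/* Version 2 only appends a field, so the v1 fields keep their offsets. */
struct pmu_req_v2 {
	unsigned int event;
	unsigned int flags;
	unsigned long long sample_period;
};

/*
 * Stand-in for the kernel side of a size-versioned call. "usize" is the
 * structure size the caller was compiled with. Returns 0 on success and
 * -1 if the caller's structure is smaller than the first version.
 */
static int pmu_configure_sketch(const void *uarg, size_t usize,
				struct pmu_req_v2 *out)
{
	size_t n = usize < sizeof(*out) ? usize : sizeof(*out);

	if (usize < sizeof(struct pmu_req_v1))
		return -1;

	memset(out, 0, sizeof(*out));	/* fields unknown to the caller read as 0 */
	memcpy(out, uarg, n);		/* copy only the part both sides share */
	return 0;
}
```

A real interface would also have to decide what to do when the caller's structure is larger than the one the kernel knows, for example by rejecting the call unless the extra bytes are zero; this sketch simply ignores the tail.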
Re: rt-preempt: problem compiling rt-preempt 2.6.23.1-rt11 on MIPS
Thomas Gleixner wrote: I fear you are the one who is in charge to get mips working again :) But as always, there are bad news and good news: As far as I heard last week John Cooper is looking into this as well. I'm not actively working on it but AFAIK I may have been the last one to touch it when I did the mips version back at Timesys. Although I was able to get a functional port of the work there were gremlins I never had sufficient time to address. That is a relatively minor issue. The more daunting problem stems from limitations in the MIPS ABI which makes the latency trace support problematic. Rather than rehash the issue: http://lists.linuxcoding.com/kernel/2005-q4/msg10163.html Until we have a usable instrumentation solution in place, characterization, debug, and support of PREEMPT_RT for MIPS is going to be a challenge. -john -- [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs
Mark Lord wrote:
> Sebastian Kemper reported that issuing CD/DVD commands under libata
> is not fully compatible with ide-scsi. In particular, the
> GPCMD_SET_STREAMING
> was being rejected at the host level in some instances.
>
> The reason is that libata-scsi insists upon the cmd_len field exactly
> matching
> the SCSI opcode being issued, whereas ide-scsi tolerates 12-byte commands
> contained within a 16-byte (cmd_len) CDB.
>
> There doesn't seem to be a good reason for us to not be compatible there,
> so here is a patch to fix libata-scsi to permit SCSI opcodes so long as
> they fit within whatever size CDB is provided.
>
> Signed-off-by: Mark Lord <[EMAIL PROTECTED]>

applied to #tj-upstream-fixes.

--
tejun
Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
On Thursday 15 November 2007 12:11, Herbert Xu wrote:
> On Wed, Nov 14, 2007 at 05:03:25PM -0800, Christoph Lameter wrote:
> > Well this is likely the result of the SLUB regression. If you allocate an
> > order 1 page then the zone locks need to be taken. SLAB queues the a

Yeah, it appears this is what happened. The lockless page allocator fastpath appears on the list and the slowpaths disappear after Herbert's patches. SLAB is doing its own thing, so it avoids that pitfall.

> > couple of higher order pages and can so serve a couple of requests
> > without going into the page allocator whereas SLUB has to go directly to
> > the page allocator for allocate and free. I guess that needs fixing in
> > the page allocator. Or do I need to add a mechanism to buffer higher
> > order page allcoations to SLUB?
>
> Actually this serves to discourage people from using high-order
> allocations which IMHO is a good thing :)

Yeah I completely agree. The right fix is in the caller... The bug / suboptimal allocation would not have been found in tcp if not for this ;)
Re: [PATCH 1/2] sata_nv: don't use legacy DMA in ADMA mode
Tejun Heo wrote:
> If so, can you please add that switching into register mode is okay as
> long as there's no other ADMA commands in flight and add
> WARN_ON((qc->flags & ATA_QCFLAG_RESULT_TF) && link->sactive)?

More accurately, link->sactive test can be substituted with (ap->qc_allocated & ~(1 << qc->tag)).

--
tejun
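The mask expression suggested above can be tried outside the kernel. The sketch below assumes, as the expression implies, that `qc_allocated` is a bitmap with one bit per allocated command tag; the helper name `other_cmds_in_flight` is made up for this example and is not a libata function.

```c
#include <assert.h>

/*
 * Clear the bit belonging to the current command's tag and return what is
 * left of the allocation bitmap. A nonzero result means that at least one
 * other command is still allocated. "tag" must be below the width of the
 * bitmap (32 bits here).
 */
static inline unsigned int other_cmds_in_flight(unsigned int qc_allocated,
						unsigned int tag)
{
	return qc_allocated & ~(1u << tag);
}
```

With only tag 0 allocated (bitmap 0x1) the result for tag 0 is zero, so a check such as the proposed WARN_ON would stay quiet; with tags 0 and 2 allocated (bitmap 0x5) the result for tag 0 is 0x4 and the warning would fire.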
Re: [PATCH 1/2] sata_nv: don't use legacy DMA in ADMA mode
Hello,

Robert Hancock wrote:
> We need to run any DMA command with result taskfile requested in ADMA mode
> when the port is in ADMA mode, otherwise it may try to use the legacy DMA
> engine
> in ADMA mode which is not allowed. Enforce this with BUG_ON() since data
> corruption could potentially result if this happened.
>
> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>
>
> --- linux-2.6.24-rc1-git10/drivers/ata/sata_nv.c 2007-11-01
> 20:01:32.0 -0600
> +++ linux-2.6.24-rc1-git10edit/drivers/ata/sata_nv.c 2007-11-13
> 19:01:09.0 -0600
> @@ -791,11 +797,13 @@
>
> static void nv_adma_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
> {
> - /* Since commands where a result TF is requested are not
> -executed in ADMA mode, the only time this function will be called
> -in ADMA mode will be if a command fails. In this case we
> -don't care about going into register mode with ADMA commands
> -pending, as the commands will all shortly be aborted anyway. */
> + /* Other than when internal or pass-through commands are executed,
> +the only time this function will be called in ADMA mode will be
> +if a command fails. In the failure case we don't care about going
> +into register mode with ADMA commands pending, as the commands will
> +all shortly be aborted anyway. We assume that NCQ commands are not
> +issued via passthrough and so this will not abort any commands in
> +that case. */
> nv_adma_register_mode(ap);

So, now if an ATA DMA command is issued w/ RESULT_TF set, it's issued using ADMA. Then when nv_adma_tf_read() is called on the success path, it switches into register mode and reads the TF, which is okay as long as there are no other ADMA commands in flight, and that's why you wrote about not issuing NCQ commands via passthrough. Am I understanding it correctly?

If so, can you please add that switching into register mode is okay as long as there are no other ADMA commands in flight and add WARN_ON((qc->flags & ATA_QCFLAG_RESULT_TF) && link->sactive)?

Thanks.

--
tejun
Re: [PATCH 2/2] sata_nv: fix ATAPI issues with memory over 4GB (v3)
Robert Hancock wrote:
> This fixes some problems with ATAPI devices on nForce4 controllers in ADMA
> mode
> on systems with memory located above 4GB. We need to delay setting the 64-bit
> DMA mask until the PRD table and padding buffer are allocated so that they
> don't
> get allocated above 4GB and break legacy mode (which is needed for ATAPI
> devices).
>
> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

applied to #tj-upstream-fixes.

--
tejun
Re: OT: Does Linux have any "Perfect Code"
From: Russell Leighton <[EMAIL PROTECTED]>
Date: Wed, 14 Nov 2007 20:21:04 -0500

> > At the risk of being egocentric, the cyclic subsystem (which is
> > executed at least 100 times per second on every Solaris system)
> > had its last substantial fix over six years ago, and its last fix
> > of any flavor over three years ago:

Yeah, if you develop at the glacial pace Solaris does, don't add any features to cyclics or work on scalability improvements, sure it can be bug free and untouched for 6 years.

I think all of this talk of perfect code is just trolling by the Solaris folks.
Re: rt-preempt: problem compiling rt-preempt 2.6.23.1-rt11 on MIPS
On Wed, 14 Nov 2007, Tim Bird wrote: > > I'm just about to release 2.6.24-rc2-rt1 and I'm sure mips as well as > > powrepc is badly broken. Any help in getting these back up and working > > would be greatly appreciated. > > I'll probably have some pretty basic questions. > If you don't mind an RT newbie helping, I'll do what I can. :-) We can use all the help we can get :-) > BTW - I was just trying to cross-compile the IBM test programs > (rt-test-0.6) today, and had some problems. I didn't want to > bug anyone until I took a closer look at it, but would it be > best to report problems with that here on LKML, on > linux-rt-users, or just directly to Darren Hart? Things related to rt-test should go to Darren Hart and to linux-rt-users. We only CC LKML on RT kernel issues. -- Steve - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
From: Paul Mackerras <[EMAIL PROTECTED]> Date: Thu, 15 Nov 2007 12:11:10 +1100 > The third (hard to extend cleanly) is a good point, and is a valid > criticism of the current set of perfmon2 system calls, I think. > However, the goal of being able to extend the interface tends to be in > opposition to the goal of having strong typing of the interface. > Things like a multiplexed syscall or an ioctl are much easier to > extend but that is at the expense of losing strong typing. I disagree. With netlink we can just add new attributes when a new need arises for a particular interface. The attribute code describes the type precisely, so there is no loss of strong typing at all. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
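As a rough illustration of what "the attribute code describes the type" can mean, here is a small type-length-value sketch. It is not the netlink API: the header layout, the `ATTR_*` codes and the helpers `attr_put` and `attr_get_u32` are all hypothetical, and the alignment padding a real protocol would need is left out. What it shows is that the check on an attribute's type and size happens when the message is parsed at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute codes; a new attribute is just a new value here. */
enum attr_type { ATTR_U32 = 1, ATTR_U64 = 2 };

/* Hypothetical header: len covers the header plus the payload. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Append one attribute at offset off; returns the new offset, or 0 if full. */
static size_t attr_put(unsigned char *buf, size_t off, size_t cap,
		       uint16_t type, const void *data, uint16_t dlen)
{
	struct attr_hdr h;

	if (dlen > UINT16_MAX - sizeof(h))
		return 0;
	if (off > cap || cap - off < sizeof(h) + dlen)
		return 0;
	h.len = (uint16_t)(sizeof(h) + dlen);
	h.type = type;
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + sizeof(h), data, dlen);
	return off + sizeof(h) + dlen;
}

/* Look up a 32-bit attribute; the payload size is verified at run time. */
static int attr_get_u32(const unsigned char *buf, size_t len,
			uint16_t type, uint32_t *out)
{
	size_t off = 0;

	while (len - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > len - off)
			return -1;	/* malformed message */
		if (h.type == type) {
			if (h.len != sizeof(h) + sizeof(uint32_t))
				return -1;	/* payload is not a u32 */
			memcpy(out, buf + off + sizeof(h), sizeof(*out));
			return 0;
		}
		off += h.len;
	}
	return -1;	/* attribute not present */
}
```

Adding an attribute later means adding a new code without touching existing ones, which is the extensibility being argued for; a mismatch between the code and the payload is caught by the parser while the program runs, not by the compiler.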
OT: Does Linux have any "Perfect Code"
Bryan Cantrill of Sun (a la DTrace) has a notion of perfect code:

http://blogs.sun.com/bmc/entry/on_i_dreaming_in_code

He also has some examples (from the bottom comment section of the above):

Can you list a small number of examples of "software perfection"?

Posted by Russell Leighton on November 14, 2007 at 04:02 AM PST #

Russell,

My canonical small example of perfection in Solaris would be Jeff Bonwick's mod-by-a-billion code in hrt2ts():

http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/timers.c#875

Solaris of course has lots of bigger, more complicated examples. Now on the one hand, one wants to refrain from pointing to thousands of lines of code and saying that there are no bugs therein, but on the other, there are many subsystems that have been in place and in heavy use for years without defect or modification. At the risk of being egocentric, the cyclic subsystem (which is executed at least 100 times per second on every Solaris system) had its last substantial fix over six years ago, and its last fix of any flavor over three years ago:

http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/cyclic.c

Modesty (and the lack, of course, of a proof of its correctness) prevents me from calling the cyclic subsystem perfect -- but such as unknown defects remain, there are damn few of them, and we can say that they must be a result of highly unusual (or at least, heretofore unseen) circumstances.

A non-Solaris example -- and one that I've been known to use as the canonical example of the persistence of software -- is Super Mario Kart. This is a game that was developed (to its completion) fifteen years ago for the Super Nintendo console. Source code, to the best of my knowledge, is not publicly available and may indeed be lost -- but the binaries persist and (if my coworkers are any indication) remain in active use. Given the longevity of, say, Homer's Odyssey, there is reason to believe that Super Mario Kart will survive in perpetuity -- that thousands of years from now, twenty-somethings somewhere will be using the software exactly as it is used today. Is this perfection? Perhaps not -- but it also might not be discernible from perfection...

Posted by Bryan Cantrill on November 14, 2007 at 07:51 AM PST #

Does Linux have any such examples of true software perfection?
Re: rt-preempt: problem compiling rt-preempt 2.6.23.1-rt11 on MIPS
Steven Rostedt wrote: > Looks like that may be just an artifact from older days. > > I'll do some more tests, and it it truely is, I'll go and remove the added > headers. Thanks. > I'm just about to release 2.6.24-rc2-rt1 and I'm sure mips as well as > powrepc is badly broken. Any help in getting these back up and working > would be greatly appreciated. I'll probably have some pretty basic questions. If you don't mind an RT newbie helping, I'll do what I can. :-) -- Tim BTW - I was just trying to cross-compile the IBM test programs (rt-test-0.6) today, and had some problems. I didn't want to bug anyone until I took a closer look at it, but would it be best to report problems with that here on LKML, on linux-rt-users, or just directly to Darren Hart? -- Tim = Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America = - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86: on x86_64, correct reading of PC RTC when update in progress in time_64.c
Correct potentially unstable PC RTC time register reading in time_64.c

Stop the use of an incorrect technique for reading the standard PC RTC timer, which is documented to "disconnect" time registers from the bus while updates are in progress. The use of UIP flag while interrupts are disabled to protect a 244 microsecond window is one of the Motorola spec sheet's documented ways to read the RTC time registers reliably.

The patch updates the misleading comments and also minimizes the amount of time that the kernel disables interrupts during the reading.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/x86/kernel/time_64.c
===
--- linux-2.6.orig/arch/x86/kernel/time_64.c
+++ linux-2.6/arch/x86/kernel/time_64.c
@@ -160,22 +160,30 @@ unsigned long read_persistent_clock(void
 	unsigned long flags;
 	unsigned century = 0;

-	spin_lock_irqsave(&rtc_lock, flags);
+ retry:	spin_lock_irqsave(&rtc_lock, flags);
+	/* if UIP is clear, then we have >= 244 microseconds before RTC
+	 * registers will be updated. Spec sheet says that this is the
+	 * reliable way to read RTC - registers invalid (off bus) during update
+	 */
+	if ((CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP)) {
+		spin_unlock_irqrestore(&rtc_lock, flags);
+		cpu_relax();
+		goto retry;
+	}

-	do {
-		sec = CMOS_READ(RTC_SECONDS);
-		min = CMOS_READ(RTC_MINUTES);
-		hour = CMOS_READ(RTC_HOURS);
-		day = CMOS_READ(RTC_DAY_OF_MONTH);
-		mon = CMOS_READ(RTC_MONTH);
-		year = CMOS_READ(RTC_YEAR);
+	/* now read all RTC registers while stable with interrupts disabled */
+
+	sec = CMOS_READ(RTC_SECONDS);
+	min = CMOS_READ(RTC_MINUTES);
+	hour = CMOS_READ(RTC_HOURS);
+	day = CMOS_READ(RTC_DAY_OF_MONTH);
+	mon = CMOS_READ(RTC_MONTH);
+	year = CMOS_READ(RTC_YEAR);
 #ifdef CONFIG_ACPI
-		if (acpi_gbl_FADT.header.revision >= FADT2_REVISION_ID &&
-		    acpi_gbl_FADT.century)
-			century = CMOS_READ(acpi_gbl_FADT.century);
+	if (acpi_gbl_FADT.header.revision >= FADT2_REVISION_ID &&
+	    acpi_gbl_FADT.century)
+		century = CMOS_READ(acpi_gbl_FADT.century);
 #endif
-	} while (sec != CMOS_READ(RTC_SECONDS));
-
 	spin_unlock_irqrestore(&rtc_lock, flags);

 	/*
Re: [PATCH] Fix warning for token-ring from sysctl checker
Olof Johansson <[EMAIL PROTECTED]> writes:

> On Wed, Nov 14, 2007 at 08:45:28AM -0800, Randy Dunlap wrote:
>> On Wed, 14 Nov 2007 08:56:20 -0700 Eric W. Biederman wrote:
>>
>> > David Miller <[EMAIL PROTECTED]> writes:
>> >
>> > > From: Olof Johansson <[EMAIL PROTECTED]>
>> > > Date: Tue, 13 Nov 2007 01:23:13 -0600
>> > >
>> > >> As seen when booting ppc64_defconfig:
>> > >>
>> > >> sysctl table check failed: /net/token-ring .3.14 procname does not match
>> > > binary path procname
>> > >>
>> > >>
>> > >> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>
>> > >
>> > > Patch applied, thanks Olof.
>> >
>> > No objections but I think we already have this fixed in the -mm tree.
>>
>> Yes, I patched it several weeks ago and it's been in -mm for a while
>> now. Apparently too long.
>
> Ah, sorry for the duplicate patch then. I must have missed it at the
> original posting (and didn't search that far back and/or -mm before
> posting it myself).

As long as the patch gets merged and makes its way upstream so people can use the token-ring code I don't really care who posts it.

Eric
Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
On Wed, Nov 14, 2007 at 05:03:25PM -0800, Christoph Lameter wrote:
>
> Well this is likely the result of the SLUB regression. If you allocate an
> order 1 page then the zone locks need to be taken. SLAB queues a
> couple of higher order pages and so can serve a couple of requests without
> going into the page allocator whereas SLUB has to go directly to the page
> allocator for allocate and free. I guess that needs fixing in the page
> allocator. Or do I need to add a mechanism to buffer higher order page
> allocations to SLUB?

Actually this serves to discourage people from using high-order
allocations which IMHO is a good thing :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
David Miller writes: > From: Paul Mackerras <[EMAIL PROTECTED]> > Date: Thu, 15 Nov 2007 10:12:22 +1100 > > > *I* never had a problem with a few extra system calls. I don't > > understand why you (apparently) do. > > We're stuck with them forever, they are hard to version and extend > cleanly. > > Those are my main objections. The first is valid (for suitable values of "forever") but applies to any user/kernel interface, not just system calls. As for the second (hard to version) I don't see why it applies to syscalls specifically more than to other interfaces. It's just a matter of designing it correctly in the first place. For example, the sys_swapcontext system call we have on powerpc takes an argument which is the size of the ucontext_t that userland is using, which allows us to extend it in future if necessary. (Note that I'm not saying that the current perfmon2 interfaces are well-designed in this respect.) The third (hard to extend cleanly) is a good point, and is a valid criticism of the current set of perfmon2 system calls, I think. However, the goal of being able to extend the interface tends to be in opposition to the goal of having strong typing of the interface. Things like a multiplexed syscall or an ioctl are much easier to extend but that is at the expense of losing strong typing. Something like my transaction() (or your weird kind of read() :) also provides extensibility but loses type safety to some degree. Also, as Andi says, this is core CPU state that we are dealing with, not some I/O device, so treating the whole of perfmon2 (or any performance monitoring infrastructure) as a driver doesn't fit very well, and in fact system calls are appropriate. Just like we don't try to make access to debugging facilities fit into a driver, we shouldn't make performance monitoring fit into a driver either. Paul. 
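The sys_swapcontext point above (userland passes the size of the structure it was built against) can be sketched roughly as follows. This is only an illustration under stated assumptions: `pmu_args_v1`, `pmu_args_v2`, `sample_period` and `pmu_configure()` are hypothetical names invented here, not perfmon2 or kernel interfaces, and the function is a user-space stand-in for what a kernel entry point might do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical v1 layout that old callers were compiled against. */
struct pmu_args_v1 {
    uint64_t reg_value;
    uint16_t reg_num;
};

/* Hypothetical v2 layout: the v1 fields come first, new fields are appended. */
struct pmu_args_v2 {
    uint64_t reg_value;
    uint16_t reg_num;
    uint64_t sample_period;   /* added in v2; 0 means "not set" */
};

/*
 * Stand-in for the receiving side. The caller passes the size of the
 * structure it was built with, so an old binary using v1 keeps working:
 * fields it does not know about simply default to zero.
 * Returns 0 on success, -1 if the size matches no known layout.
 */
static int pmu_configure(const void *uargs, size_t usize,
                         struct pmu_args_v2 *out)
{
    assert(uargs != NULL && out != NULL);
    if (usize != sizeof(struct pmu_args_v1) &&
        usize != sizeof(struct pmu_args_v2))
        return -1;
    memset(out, 0, sizeof(*out));   /* defaults for missing trailing fields */
    memcpy(out, uargs, usize);      /* v1 is a layout prefix of v2 */
    return 0;
}
```

The trade-off Paul describes still applies: each layout stays strongly typed, but the entry point itself only sees an untyped pointer plus a size.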
Re: rt-preempt: problem compiling rt-preempt 2.6.23.1-rt11 on MIPS
On Thu, 15 Nov 2007, Thomas Gleixner wrote:
> On Wed, 14 Nov 2007, Tim Bird wrote:
>
> Added Steven and John to CC
>
> > I applied the patches in patch-2.6.23.1-rt11-broken-out.tar.bz2
> > to a Linux kernel version 2.6.23.1 (along with a few other
> > board specific patches).
> >
> > I got the following compilation error:
> >
> > GEN /home/tbird/work/rt-preempt/build/tx49/Makefile
> > CHK include/linux/version.h
> > CHK include/linux/utsrelease.h
> > CALL /home/tbird/work/rt-preempt/linux-2.6.23.1-rt11/scripts/checksyscalls.sh
> > CHK include/linux/compile.h
> > CC kernel/latency_trace.o
> > /home/tbird/work/rt-preempt/linux-2.6.23.1-rt11/kernel/latency_trace.c:28:21: error: asm/rtc.h: No such file or directory
> > make[2]: *** [kernel/latency_trace.o] Error 1
> > make[1]: *** [kernel] Error 2
> > make: *** [vmlinux] Error 2
> >
> > Indeed, there is no include/asm-mips/rtc.h.

Looks like that may be just an artifact from older days. I'll do some
more tests, and if it truly is, I'll go and remove the added headers.

> >
> > I commented out the include line in latency_trace.c, and everything
> > compiled fine. I'm not sure what is needed in an arch-specific rtc.h,
> > but compiling without it for the mips arch caused no problems.
> > Should I create a patch with a stub for rtc.h for mips?
>
> Hm, dunno why the rtc.h include was added.
>
> > As an aside, this has me worried. Is anyone else doing any
> > RT Preempt testing or work on MIPS platforms, or am I forging
> > new ground? :-)
>
> I fear you are the one who is in charge to get mips working again :)
> But as always, there are bad news and good news: As far as I heard
> last week John Cooper is looking into this as well.

I'm just about to release 2.6.24-rc2-rt1 and I'm sure mips as well as
powerpc is badly broken. Any help in getting these back up and working
would be greatly appreciated.

-- Steve
Re: x86 32-bit machine check handler
On Tue, 2007-11-13 at 15:15 +0100, Andi Kleen wrote:
> Max Asbock <[EMAIL PROTECTED]> writes:
>
> > Now that the 32-bit and 64-bit x86 machine check handlers live next to
> > each other a certain asymmetry in functionality is apparent. Notably,
> > the 64-bit machine check handler implements a timer that periodically
> > polls for silent machine check errors and makes them accessible to user
> > space through /dev/mcelog.
>
> Actually 32bit implements that too (non-fatal.c). But it misses some
> of the more advanced functionality like AMD Threshold Interrupts.
>
> > Are there reasons the x86 32-bit machine
> > check handler couldn't do the same?
>
> The 32bit machine check code has some serious design problems. The
> best would be probably to just move 32bit over to the 64bit code too. In
> fact there was a patch to do that some time ago, but it ran into some
> minor problems and was unfortunately never merged. But it would be the
> right thing to do.

I found a patch from about three years ago that implemented a 32-bit
version of the x86_64 machine check handler. Do you know of any newer
attempts? However, given the merge of x86, a single implementation
should be able to handle both the 32-bit and 64-bit cases.

I tried to build the 64-bit machine check handler (mce_64.c) for 32-bit
to see what kind of problems it would run into. So far I found a few
things:
- there is no idle_notifier_register in 32-bit x86
- there is no oops_begin in 32-bit x86
- register names are different (rip, cs)
- some data types would have to be adjusted to be 64 bit
The issues seem to be surmountable.

> The only missing functionality on the 64bit side would be support for
> old non IA compliant old machine checks like P5 or WinChip. One option
> would be to simply drop them. AFAIK these CPUs don't really have
> anywhere near usable machine check capability anyways so dropping it
> would not make much difference. Or alternatively keep p5.c/winchip.c
> around. But if you look at them they don't do much except simple
> printk with not much information and printk in a machine check handler
> is always wrong because it can deadlock. I personally would prefer
> dropping.
>
> And I think one or two K7 quirks are also missing on 64bit, but these
> would be very easy to add. Other than that it should just work on
> 32bit CPUs.
>

So it looks like giving 32-bit x86 the same machine check support as in
64-bit is both feasible and desirable. Are there any plans to do this or
is anybody currently working on it?

thanks,
Max
Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
On Wed, 14 Nov 2007, David Miller wrote:

> > As a result, we may allocate more than a page of data in the
> > non-TSO case when exactly one page is desired.

Well this is likely the result of the SLUB regression. If you allocate an
order 1 page then the zone locks need to be taken. SLAB queues a
couple of higher order pages and so can serve a couple of requests without
going into the page allocator whereas SLUB has to go directly to the page
allocator for allocate and free. I guess that needs fixing in the page
allocator. Or do I need to add a mechanism to buffer higher order page
allocations to SLUB?
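The idea of queueing a few higher-order pages in front of the page allocator can be pictured with a small user-space model. This is a rough sketch, not SLAB or SLUB code: `order1_queue`, `queue_alloc()`, `queue_free()`, the queue depth and the 4 KiB page size are all assumptions made for illustration, and `backend_calls` merely stands in for the number of trips that would take the zone lock.

```c
#include <assert.h>
#include <stdlib.h>

#define ORDER1_BYTES (2 * 4096)  /* assumed: an order-1 block is two 4 KiB pages */
#define QUEUE_DEPTH  4           /* assumed depth, chosen arbitrarily */

static unsigned long backend_calls;  /* counts trips to the "page allocator" */

/* Stand-ins for the page allocator; each call models taking the zone lock. */
static void *backend_alloc(void) { backend_calls++; return malloc(ORDER1_BYTES); }
static void backend_free(void *p) { backend_calls++; free(p); }

struct order1_queue {
    void *slot[QUEUE_DEPTH];
    int count;
};

/* Serve from the queue when possible, otherwise fall through to the backend. */
static void *queue_alloc(struct order1_queue *q)
{
    assert(q != NULL);
    if (q->count > 0)
        return q->slot[--q->count];
    return backend_alloc();
}

/* Park the block in the queue when there is room, otherwise really free it. */
static void queue_free(struct order1_queue *q, void *p)
{
    assert(q != NULL && p != NULL);
    if (q->count < QUEUE_DEPTH)
        q->slot[q->count++] = p;
    else
        backend_free(p);
}
```

In this model an alloc/free/alloc sequence reaches the backend once instead of three times, which is the effect described for SLAB above; the cost is memory parked in the queue.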
Re: 2.6.24-rc2-mm1
On Nov 15, 2007 5:27 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: > On Wed, 2007-11-14 at 20:19 +0100, Jiri Kosina wrote: > > On Wed, 14 Nov 2007, Kay Sievers wrote: > > > > > Could it be an init-order problem, where something tries to use the > > > block subsystem? Before it is initialized with: > > > block/genhd.c :: subsys_initcall(genhd_device_init); > > > If that's the case, we have an old bug that nobody noticed with static > > > structures, which are zeroed that time, but definitely not properly > > > initialized. I'll try to build loop non-modular now, and see if that > > > makes the bug appear here. > > > my .config with which I reproduce this on 2.6.24-rc2-mm1 reliably can be > > obtained from http://www.jikos.cz/jikos/junk/.config > > Hmm, that config doesn't do anything here, and if I make it boot, it > does not show the bug. > > Could you possibly enable kobject debugging and see if that exposes > something, maybe something goes wrong with the kset refcount and it gets > released while in use. > Hi, I will do that. BTW, the bug is reported with EIP at __list_add when CONFIG_DEBUG_LIST=y. Regards, dave
Re: rt-preempt: problem compiling rt-preempt 2.6.23.1-rt11 on MIPS
On Wed, 14 Nov 2007, Tim Bird wrote: Added Steven and John to CC > I applied the patches in patch-2.6.23.1-rt11-broken-out.tar.bz2 > to a Linux kernel version 2.6.23.1 (along with a few other > board specific patches). > > I got the following compilation error: > > GEN /home/tbird/work/rt-preempt/build/tx49/Makefile > CHK include/linux/version.h > CHK include/linux/utsrelease.h > CALL > /home/tbird/work/rt-preempt/linux-2.6.23.1-rt11/scripts/checksyscalls.sh > CHK include/linux/compile.h > CC kernel/latency_trace.o > /home/tbird/work/rt-preempt/linux-2.6.23.1-rt11/kernel/latency_trace.c:28:21: > error: asm/rtc.h: No such file or directory > make[2]: *** [kernel/latency_trace.o] Error 1 > make[1]: *** [kernel] Error 2 > make: *** [vmlinux] Error 2 > > Indeed, there is no include/asm-mips/rtc.h. > > I commented out the include line in latency_trace.c, and everything > compiled fine. I'm not sure what is needed in an arch-specific rtc.h, > but compiling without it for the mips arch caused no problems. > Should I create a patch with a stub for rtc.h for mips? Hm, dunno why the rtc.h include was added. > As an aside, this has me worried. Is anyone else doing any > RT Preempt testing or work on MIPS platforms, or am I forging > new ground? :-) I fear you are the one who is in charge to get mips working again :) But as always, there are bad news and good news: As far as I heard last week John Cooper is looking into this as well. Thanks, tglx - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/5][RFC] Physical PCI slot objects
On Tue, Nov 13, 2007 at 06:37:32PM -0700, Alex Chiang wrote: > Hi Gary, > > * Gary Hade <[EMAIL PROTECTED]>: > > On Tue, Nov 13, 2007 at 01:11:02PM -0700, Matthew Wilcox wrote: > > > On Tue, Nov 13, 2007 at 10:51:22AM -0800, Greg KH wrote: > > > > Ok, again, I want to see the IBM people sign off on this, after testing > > > > on all of their machines, before I'll consider this, as I know the IBM > > > > acpi tables are "odd". > > > > > > That seems a little higher standard than patches are normally held to. > > > How about the patches get sent to the appropriate people at IBM (who are > > > they?) > > > > I be one of them. :) I have been involved in many (but not all) > > of IBM's x86 based (IBM System x) servers with hotplug capable > > PCI slots. I have mostly worked on 'acpiphp' associated issues. > > Thanks for testing the series. It's much appreciated. > > > Have you possibly considered a kernel option as a kinder and > > gentler way of introducing the changes? > > That is a good idea. I will work on that. Thanks. This will allow everyone to focus on the systems where the changes are most beneficial and not waste a bunch of time trying to test everywhere. > > > > > IBM x3850 > > Slots 1-2: PCI-X under PCI root bridges > > Slots 3-6: PCIe under transparent P2P bridges > > Slot 1: PCI-X - populated > > Slot 2: PCI-X - !populated > > Slot 3: PCIe - populated > > Slot 4: PCIe - !populated > > Slot 5: PCIe - !populated > > Slot 6: PCIe - populated > > > > result is with 2.6.24-rc2 plus all 4 proposed patches > > Silly question, but I have to ask. :) Hey, this isn't a silly question. :) > > I sent out 5 patches -- is this simply a typo on your part, or > did you only apply 4/5 patches? Yes, it is just a typo. I did apply all 5 patches. > > > problem: acpiphp failed to register empty PCIe slots 4 and 5 > > Ok, so acpiphp wasn't going to register those slots anyway, since > they are empty. 
No, acpiphp should (and did before your changes) register all
hotplug capable slots. All 6 slots (2 PCI-X, 4 PCIe) in that
system are hotplug capable. Emptiness shouldn't matter. If the
empty slots are not registered it is not possible to successfully
hotplug cards to them.

Without your changes acpiphp loads with the following output.
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
acpiphp: Slot [1] registered
acpiphp: Slot [2] registered
acpiphp: Slot [3] registered
acpiphp: Slot [4] registered
acpiphp: Slot [5] registered
acpiphp: Slot [6] registered

With your changes I confirmed that an attempted hotplug to a
boot-time vacant PCIe slot failed as expected. The driver saw
the insertion event but didn't find anything to enable:
acpiphp_glue: handle_hotplug_event_bridge: Bus check notify on \_SB_.VP05.CALG
acpiphp_glue: handle_hotplug_event_bridge: re-enumerating slots under \_SB_.VP05.CALG
acpiphp_glue: acpiphp_check_bridge: 0 enabled, 0 disabled

> It would have bailed out after not seeing _ADR or
> _EJ0 on those slots.

Well, both _ADR and _EJ0 exist for each of the 4 PCIe slots.

>
> The acpi-pci-slot driver created those slots anyway, which is one
> of the points of the patch -- to create sysfs entries even for
> empty slots.
>
> > acpiphp_glue: found PCI-to-PCI bridge at PCI :0f:00.0
>
> This is the real address of slot 4.

No, the P2P parent bus is :0f and the P2P child bus is :10
so I believe the real address for slot 4 should be :10:00.

kernel without your changes after loading acpiphp:
# cat /sys/bus/pci/slots/4/address
:10:00

kernel with your changes both before and after loading acpiphp:
# cat /sys/bus/pci/slots/4/address
:0f:00

>
> > acpiphp_glue: found ACPI PCI Hotplug slot 4 at PCI :10:00
> > acpiphp: pci_hp_register failed with error -17
> > acpiphp_glue: acpiphp_register_hotplug_slot failed(err code = 0xffef)
> [repeated 7x]
>
> We saw this message 8x, once for each SxFy object under your p2p
> bridge.
I actually somewhat did expect to see this error message > (hence the RFC part of my patch ;) > > I currently don't have a good way to determine if we've already > seen an empty slot under a p2p bridge, so we try to register > every SxFy object. Of course, a /sys/bus/pci/slots/4/ entry > already exists, so that's why we're getting -17 (-EEXIST). Of course, this kind of confusing noise would not be acceptable in the final version of your changes. > > > acpiphp_glue: found PCI-to-PCI bridge at PCI :14:00.0 > > acpiphp_glue: found ACPI PCI Hotplug slot 5 at PCI :15:00 > > acpiphp: pci_hp_register failed with error -17 > > acpiphp_glue: acpiphp_register_hotplug_slot failed(err code = 0xffef) > > Same explanation as above. > > > # find /sys/bus/pci/slots > > /sys/bus/pci/slots > > [snip] > > > /sys/bus/pci/slots/4 > > /sys/bus/pci/slots/4/address > > /sys/bus/pci/slots/5 > > /sys/bus/pci/slots/5/address > > Arguably, the right thing happened here. We got entries for empty >
Re: [PATCH 0/5, v2] Physical PCI slot objects
On Wed, Nov 14, 2007 at 12:36:05PM -0700, Alex Chiang wrote:
> * Alex Chiang <[EMAIL PROTECTED]>:
> >
> > Actually, I just reworked my patch this morning, and believe that
> > I have a much cleaner implementation now that should fix a lot of
> > the errors you saw.

Sorry, I lost access to the x3850 that I was using and may not be
able to get back on it until early next week. I think I will go ahead
and send you the comments in response to your earlier message that I
was going to hold off on until I tried your revised patches. It may
contain something you haven't considered.

Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[EMAIL PROTECTED]
http://www.ibm.com/linux/ltc
Re: [PATCH] Fix warning for token-ring from sysctl checker
On Wed, Nov 14, 2007 at 08:45:28AM -0800, Randy Dunlap wrote:
> On Wed, 14 Nov 2007 08:56:20 -0700 Eric W. Biederman wrote:
>
> > David Miller <[EMAIL PROTECTED]> writes:
> >
> > > From: Olof Johansson <[EMAIL PROTECTED]>
> > > Date: Tue, 13 Nov 2007 01:23:13 -0600
> > >
> > >> As seen when booting ppc64_defconfig:
> > >>
> > >> sysctl table check failed: /net/token-ring .3.14 procname does not match binary path procname
> > >>
> > >>
> > >> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>
> > >
> > > Patch applied, thanks Olof.
> >
> > No objections but I think we already have this fixed in the -mm tree.
>
> Yes, I patched it several weeks ago and it's been in -mm for a while
> now. Apparently too long.

Ah, sorry for the duplicate patch then. I must have missed it at the
original posting (and didn't search that far back and/or -mm before
posting it myself).

-Olof
Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
From: Nick Piggin <[EMAIL PROTECTED]> Date: Thu, 15 Nov 2007 11:21:36 +1100 > On Thursday 15 November 2007 10:46, David Miller wrote: > > From: Herbert Xu <[EMAIL PROTECTED]> > > Date: Wed, 14 Nov 2007 19:48:44 +0800 > > > > Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> > > > > Applied and I'll queue it up for -stable too. > > Good result. Thanks, everyone! This case is a good example to use the next time a stupid thread starts up about bug reports not being looked into. To me it seems clearly more a matter of the quality of the bug report.
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
Andi Kleen writes:

> > This only works when counting (not sampling) and only for self-monitoring.
>
> It works for global monitoring too.

How would you provide access to the counters of another process?
Through an extension to ptrace perhaps?

Paul.
Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
On Thursday 15 November 2007 10:46, David Miller wrote: > From: Herbert Xu <[EMAIL PROTECTED]> > Date: Wed, 14 Nov 2007 19:48:44 +0800 > > Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> > > Applied and I'll queue it up for -stable too. Good result. Thanks, everyone!
rt-preempt: problem compiling rt-preempt 2.6.23.1-rt11 on MIPS
I applied the patches in patch-2.6.23.1-rt11-broken-out.tar.bz2 to a Linux kernel version 2.6.23.1 (along with a few other board specific patches). I got the following compilation error: GEN /home/tbird/work/rt-preempt/build/tx49/Makefile CHK include/linux/version.h CHK include/linux/utsrelease.h CALL /home/tbird/work/rt-preempt/linux-2.6.23.1-rt11/scripts/checksyscalls.sh CHK include/linux/compile.h CC kernel/latency_trace.o /home/tbird/work/rt-preempt/linux-2.6.23.1-rt11/kernel/latency_trace.c:28:21: error: asm/rtc.h: No such file or directory make[2]: *** [kernel/latency_trace.o] Error 1 make[1]: *** [kernel] Error 2 make: *** [vmlinux] Error 2 Indeed, there is no include/asm-mips/rtc.h. I commented out the include line in latency_trace.c, and everything compiled fine. I'm not sure what is needed in an arch-specific rtc.h, but compiling without it for the mips arch caused no problems. Should I create a patch with a stub for rtc.h for mips? As an aside, this has me worried. Is anyone else doing any RT Preempt testing or work on MIPS platforms, or am I forging new ground? :-) -- Tim = Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America = - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [perfmon] Re: [perfmon2] perfmon2 merge news
Andi, On Wed, Nov 14, 2007 at 03:24:11PM +0100, Andi Kleen wrote: > On Wed, Nov 14, 2007 at 05:09:09AM -0800, Stephane Eranian wrote: > > > > Partially true. The file descriptor becomes really useful when you sample. > > You leverage the file descriptor to receive notifications of counter > > overflows > > and full sampling buffer. You extract notification messages via read() and > > you can > > use SIGIO, select/poll. > > Hmm, ok for the event notification we would need a nice interface. Still > have my doubts a file descriptor is the best way to do this though. > Why do you think the existing interfaces are not a good fit for this? Is this just because of your problem with file descriptors? >From my experience read(), select(), and SIGIO are fine. I know many tools use >that. As for the file descriptor, you would need to replace that with another identifier of some sort. As I pointed out in another message on this thread, you don't want to use a pid-based identifier. This is not usable when you monitor other threads and you want to read out the results after their death. > > Are you suggesting something like: pfm_write_pmcs(fd, 0, 0x1234)? > > See my example below. > > > > That would be quite expensive when you have lots of registers to setup: one > > syscall per register. The perfmon syscalls to read/write registers accept > > vector > > of arguments to amortize the cost of the syscall over multiple registers > > (similar to poll(2)). > > > First system calls are not that slow on Linux. Measure it. > If people do not like vector arguments, then I think I can live with N system calls to program N registers. 
Now you have two choices for passing the arguments:

- a pointer to a struct:

	struct pfarg_pmc {
		uint64_t reg_value;
		uint16_t reg_num;
	} pmc0;

	pmc0.reg_num = 0;
	pmc0.reg_value = 0x1234;
	pfm_write_pmcs(fd, &pmc0);

- explicitly passing every field:

	pfm_write_pmcs(fd, 0x0, 0x1234);

Given that event set and multiplexing would not be in initially, we would want to allow for them to be added later without having to create yet another system call, right? Of course the same approach would work for the data registers at least for counting.

> > With many tools, registers are not just setup once. During certain > > measurements, > > data registers may be read multiple times. When you sample or multiplex at > > I think you optimize the wrong thing here. > > There are basically two cases I see: > > - Global measurement of lots of things: I am not sure I understand what you mean by 'lots of things'? Are you still talking per-thread and self-monitoring? > Things are slow anyways with large context switch overheads. The > overheads are large anyways. Doing one or more system calls probably > does not matter much. Most important is a clean interface. > > - Exact measurement of the current process. For that you need very > low latencies. Any system call is too slow. That is why CPUs have > instructions like RDPMC that allow to read those registers with > minimal latency in user space. Interface should support those. > I don't have a problem with that. And in fact, I already support that at least on Itanium. I had that in there for X86 but I dropped it after you said that you would enable cr4.pce globally. I don't have a problem adding it back for self-monitoring sessions. > Also for this case programming time does not matter too much. You > just program once and then do RDPMC before code to measure and then > afterwards and take the difference. The actual counter setup is out > of the latency critical path. > Agreed. > > > It depends on what you are doing.
Here, this was not really necessary. It > > was > > meant to show how you can program the data registers as well. Perfmon2 > > provides > > default values for all data registers. For counters, the value is > > guaranteed to > > be zero. > > > > But it is important to note that not all data registers are counters. That > > is the > > case of Itanium 2, some are just buffers. On AMD Barcelona IBS several are > > buffers as > > well, and some may need to be initialized to non zero value, i.e., the IBS > > sampling > > period. > > Setting period should be a separate call. Mixing the two together into one > does not look like a nice interface. > Periods are setup by data register. Given that there is already a call to program the data register why add another one? You don't need to treat the sampling period differently from the register value. This just a value that will cause the register to overflow after an explicit number of occurrences. > > With event-based sampling, the period is expressed as the number of > > occurrences > > of an event. For instance, you can say: " take a sample every 2000 L2 cache > > misses". > > The way you express this with perfmon2 is that
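The remark that a sampling period is just a register value chosen so that the counter overflows after a given number of occurrences can be illustrated with a little arithmetic. This is a sketch under assumptions, not perfmon2 code: it assumes an up-counting register of `width` bits that raises its overflow when it wraps to zero, and `counter_mask()`, `period_to_counter()` and `events_until_overflow()` are hypothetical helper names. The 48-bit width used in the usage note is likewise only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask for a counter that is `width` bits wide (1..64). */
static uint64_t counter_mask(unsigned width)
{
    assert(width >= 1 && width <= 64);
    return (width == 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/*
 * Initial register value so that an up-counting register of `width` bits
 * wraps to zero after exactly `period` events, i.e. 2^width - period.
 */
static uint64_t period_to_counter(uint64_t period, unsigned width)
{
    uint64_t mask = counter_mask(width);
    assert(period >= 1 && period <= mask);
    return (UINT64_C(0) - period) & mask;
}

/*
 * Number of increments before a counter holding `start` wraps to zero.
 * Note: for width == 64 and start == 0 the true answer (2^64) does not
 * fit in 64 bits and this returns 0.
 */
static uint64_t events_until_overflow(uint64_t start, unsigned width)
{
    uint64_t mask = counter_mask(width);
    return (mask - (start & mask)) + 1;
}
```

For the "every 2000 L2 cache misses" example, `period_to_counter(2000, 48)` would be the value written to the data register, and `events_until_overflow()` of that value gives back 2000.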
Re: [bug] SLOB crash, 2.6.24-rc2
On Wed, Nov 14, 2007 at 03:41:43PM -0800, David Miller wrote:
> From: Matt Mackall <[EMAIL PROTECTED]>
> Date: Wed, 14 Nov 2007 17:37:13 -0600
>
> > No, the usual strategy for debugging problems -outside- SLOB is to
> > switch to another allocator with more extensive debugging facilities.
>
> Ok, so the thing we still can do is do a dump_stack() at the
> list debugging assertion trigger points.

It's also pretty easy to add some debugging code to make SLOB walk all
its lists at alloc/free time.

--
Mathematics is the supreme nostalgia of our time.
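Walking a list to check its integrity, as suggested above for SLOB, might look roughly like the following user-space sketch. It is not SLOB or kernel list code: `struct node`, `list_init()`, `list_add()` and `list_check()` are simplified stand-ins written for this illustration, and the `limit` parameter is an assumption added so that a corrupted list cannot make the walk loop forever.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list node (stand-in, not the kernel's). */
struct node {
    struct node *prev, *next;
};

/* Make `head` an empty circular list. */
static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

/* Insert `n` right after `head`. */
static void list_add(struct node *n, struct node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/*
 * Walk the whole list and verify that every forward link is mirrored by
 * the matching backward link. Returns the number of entries on success,
 * or -1 at the first inconsistency or once more than `limit` entries
 * have been visited.
 */
static long list_check(const struct node *head, long limit)
{
    long n = 0;
    const struct node *cur = head;

    assert(head != NULL);
    do {
        if (cur->next == NULL || cur->next->prev != cur)
            return -1;
        cur = cur->next;
        if (cur != head && ++n > limit)
            return -1;
    } while (cur != head);
    return n;
}
```

Calling such a check on entry to every alloc and free is slow, but it tends to report corruption close to the operation that caused it rather than at some later, unrelated list operation.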
Re: Repeated oopses with inode == 0xffffffff in 2.6.23
On 11/14/2007 06:51 PM, Chuck Ebbert wrote:
> What does it mean when a pointer to an inode has a value of -1?
>
> https://bugzilla.redhat.com/show_bug.cgi?id=334181
>
>
> /usr/src/debug/kernel-2.6.22/linux-2.6.22.x86_64/block/ll_rw_blk.c:3139

Oops, this is 2.6.22.9...
question on odd APIC behavior
Hi,

is there a way to misprogram an APIC so that a single physical
interrupt is delivered as two interrupts?

Regards
Oliver