Re: [PATCH] X86: fix typo PAT to X86_PAT
On Friday 18 January 2008 07:28:49 pm Dave Jones wrote:
> On Fri, Jan 18, 2008 at 10:02:10PM +0100, Ingo Molnar wrote:
> >
> > * Dave Jones <[EMAIL PROTECTED]> wrote:
> >
> > > > you mean modifies MTRRs? Which code is that? (besides the
> > > > /proc/mtrr userspace API)
> > >
> > > This exclusion is going to be a real pain in the ass for distro
> > > kernels. It's impossible for example to build a kernel that will now
> > > support the MTRR-alike registers on the AMD K6/early Cyrix etc and
> > > also support PAT.
> > >
> > > Additionally, given people tend to update their kernels a lot more
> > > often than they update to a whole new version of X, it means until
> > > userspace has caught up, we can't ship a kernel with PAT supported, or
> > > else X gets a lot slower due to the missing mtrr support.
> >
> > there's no exclusion enforced right now, and if a CPU is PAT-incapable
> > (or if the kernel is booted nopat) then the MTRR bits should be usable.
> > But if we boot with PAT enabled, and Xorg gets /proc/mtrr wrong, we'll
> > see nasty crashes. If it gets them right, it should all still work just
> > fine. Is this ok? Then, in a year or two, distros can disable write
> > support to /proc/mtrr. Hm?
>
> A crazy idea just occurred to me.. We could make /proc/mtrr an interface
> to set PAT on a range of memory. This would make it transparently work
> without any changes in X or anything else that sets them in userspace.

good idea...

we need to make X86_PAT depend on MTRR in arch/x86/Kconfig

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH] X86: disable X86_PAT really
When X86_PAT is not selected, we don't need to do anything in
reserve_mattr and free_mattr. Both also need to bail out if the CPU does
not support PAT.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 1036134..b3cdee1 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -57,12 +57,9 @@ static int pat_known_cpu(void)
 
 void pat_init(void)
 {
+#ifdef CONFIG_X86_PAT
 	u64 pat;
 
-#ifndef CONFIG_X86_PAT
-	nopat(NULL);
-#endif
-
 	if (!smp_processor_id() && !pat_known_cpu())
 		return;
 
@@ -90,6 +87,7 @@ void pat_init(void)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 	printk(KERN_INFO "x86 PAT enabled: cpu %d, old 0x%Lx, new 0x%Lx\n",
 	       smp_processor_id(), boot_pat_state, pat);
+#endif
 }
 
 #undef PAT
@@ -135,9 +133,13 @@ static DEFINE_SPINLOCK(mattr_lock);/* protects memattr list */
 
 int reserve_mattr(u64 start, u64 end, unsigned long attr, unsigned long *fattr)
 {
+#ifdef CONFIG_X86_PAT
 	struct memattr *ma = NULL, *ml;
 	int err = 0;
 
+	if (!pat_wc_enabled)
+		return 0;
+
 	if (fattr)
 		*fattr = attr;
 
@@ -191,13 +193,20 @@ int reserve_mattr(u64 start, u64 end, unsigned long attr, unsigned long *fattr)
 
 	spin_unlock(&mattr_lock);
 	return err;
+#else
+	return 0;
+#endif
 }
 
 int free_mattr(u64 start, u64 end, unsigned long attr)
 {
+#ifdef CONFIG_X86_PAT
 	struct memattr *ml;
 	int err = attr ? -EBUSY : 0;
 
+	if (!pat_wc_enabled)
+		return 0;
+
 	if (is_memory_any_valid(start, end))
 		return 0;
 
@@ -221,6 +230,9 @@ int free_mattr(u64 start, u64 end, unsigned long attr)
 		current->comm, current->pid,
 		start, end, cattr_name(attr));
 	return err;
+#else
+	return 0;
+#endif
 }
 
 /* /dev/mem interface. Use the previous mapping */
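
For readers skimming the diff, the change combines a compile-time stub with a runtime early return. The following is only a minimal userspace sketch of that pattern, not the kernel code: SKETCH_CONFIG_PAT, sketch_pat_enabled and sketch_reserve() are made-up stand-ins for CONFIG_X86_PAT, pat_wc_enabled and reserve_mattr().

```c
/* Hypothetical stand-in for CONFIG_X86_PAT; remove it to build the stub path. */
#define SKETCH_CONFIG_PAT 1

/* Hypothetical stand-in for the runtime flag pat_wc_enabled. */
static int sketch_pat_enabled;

/* Counts the calls that actually reached the bookkeeping code. */
static int sketch_reservations;

/*
 * Returns 0 on success. Does nothing when PAT is compiled out or when it
 * is disabled at runtime, which is the behaviour the patch describes.
 */
int sketch_reserve(unsigned long long start, unsigned long long end)
{
#ifdef SKETCH_CONFIG_PAT
	if (!sketch_pat_enabled)
		return 0;	/* CPU without PAT, or PAT switched off */
	if (start >= end)
		return -1;	/* illustrative check only, not from the patch */
	sketch_reservations++;
	return 0;
#else
	(void)start;
	(void)end;
	return 0;
#endif
}
```

With the flag cleared, callers see success and no state changes, so they need no #ifdef of their own.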
Re: [PATCH] x86: fix unconditional arch/x86/kernel/pcspeaker.c compiling
On 1/18/08, Michael Opdenacker <[EMAIL PROTECTED]> wrote:
> Do you mean "almost nothing"? It still allocates and adds a platform
> device, and the corresponding function always gets called at boot time.

Nothing significant then. I don't see any added functionality from this
file.

--
Taral <[EMAIL PROTECTED]>
"Please let me know if there's any further trouble I can give you."
    -- Unknown
Re: [GEODE] Geode GX/LX watchdog timer (was 2.6.24-rc8 hangs at mfgpt-timer)
On Fri, Jan 18, 2008 at 06:06:24PM -0700, Jordan Crouse wrote:
> I don't know how much of a hassle it would be for Andres to get a 2.6.24
> kernel running on the OLPC to make sure that this isn't a regression
> in the timer tick code (I suspect it isn't a regression, but you never
> know). I also think that it would probably be in our best interest to
> default CONFIG_GEODE_MFGPT_TIMER to 'n' until we get this figured
> out. Since most BIOSen don't have timers available, that shouldn't affect
> too many people.

Well, I've successfully used an earlier version of this code with 2.6.22
on a PCEngines ALIX motherboard equipped with LX800/CS5536. It boots on
TinyBIOS. I will try 2.6.24 + this patch on these boards when I have
some time.

Willy
[PATCH] [7/8] CPA: Implement GBpages support in change_page_attr()
Teach c_p_a() to split and unsplit GB pages. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/x86/mm/pageattr_64.c | 150 -- 1 file changed, 119 insertions(+), 31 deletions(-) Index: linux/arch/x86/mm/pageattr_64.c === --- linux.orig/arch/x86/mm/pageattr_64.c +++ linux/arch/x86/mm/pageattr_64.c @@ -40,6 +40,9 @@ pte_t *lookup_address(unsigned long addr pud = pud_offset(pgd, address); if (!pud_present(*pud)) return NULL; + *level = 2; + if (pud_large(*pud)) + return (pte_t *)pud; pmd = pmd_offset(pud, address); if (!pmd_present(*pmd)) return NULL; @@ -53,30 +56,85 @@ pte_t *lookup_address(unsigned long addr return pte; } -static struct page *split_large_page(unsigned long address, pgprot_t prot, -pgprot_t ref_prot) -{ - int i; +static pte_t *alloc_split_page(struct page **base) +{ + struct page *p = alloc_page(GFP_KERNEL); + if (!p) + return NULL; + SetPagePrivate(p); + page_private(p) = 0; + *base = p; + return page_address(p); +} + +static struct page *free_split_page(struct page *base) +{ + BUG_ON(!PagePrivate(base)); + BUG_ON(page_private(base) != 0); + ClearPagePrivate(base); + __free_page(base); + return NULL; +} + +static struct page * +split_pmd(unsigned long paddr, pgprot_t prot, pgprot_t ref_prot) +{ + int i; unsigned long addr; - struct page *base = alloc_pages(GFP_KERNEL, 0); - pte_t *pbase; - if (!base) + struct page *base; + pte_t *pbase = alloc_split_page(&base); + if (!pbase) return NULL; - /* -* page_private is used to track the number of entries in -* the page table page have non standard attributes. -*/ - SetPagePrivate(base); - page_private(base) = 0; - address = __pa(address); - addr = address & PMD_PAGE_MASK; - pbase = (pte_t *)page_address(base); - for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) { - pbase[i] = pfn_pte(addr >> PAGE_SHIFT, - addr == address ? prot : ref_prot); + addr = paddr & PMD_PAGE_MASK; + for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) + pbase[i] = pfn_pte(addr >> PAGE_SHIFT, + addr == paddr ? 
prot : ref_prot); + + return base; +} + +static struct page * +split_gb(unsigned long paddr, pgprot_t prot, pgprot_t ref_prot) +{ + unsigned long addr; + int i; + struct page *base; + pte_t *pbase = alloc_split_page(&base); + + if (!pbase) + return NULL; + addr = paddr & PUD_PAGE_MASK; + for (i = 0; i < PTRS_PER_PMD; i++, addr += PMD_PAGE_SIZE) { + if (paddr >= addr && paddr < addr + PMD_PAGE_SIZE) { + struct page *l3; + l3 = split_pmd(paddr, prot, ref_prot); + if (!l3) + return free_split_page(base); + page_private(l3)++; + pbase[i] = mk_pte(l3, ref_prot); + } else { + pbase[i] = pfn_pte(addr>>PAGE_SHIFT, ref_prot); + pbase[i] = pte_mkhuge(pbase[i]); + } } return base; +} + +static struct page *split_large_page(unsigned long address, pgprot_t prot, +pgprot_t ref_prot, int level) +{ + unsigned long paddr = __pa(address); + if (level == 2) + return split_gb(paddr, prot, ref_prot); + else if (level == 3) + return split_pmd(paddr, prot, ref_prot); + else { + printk("address %lx\n", address); + dump_pagetable(address); + BUG(); + } + return NULL; } struct flush_arg { @@ -132,17 +190,40 @@ static inline void save_page(struct page list_add(&fpage->lru, &deferred_pages); } +static void reset_large_pte(pte_t *pte, unsigned long addr, pgprot_t prot) +{ + unsigned long pfn = __pa(addr) >> PAGE_SHIFT; + set_pte(pte, pte_mkhuge(pfn_pte(pfn, prot))); +} + +static void +revert_gb(unsigned long address, pud_t *pud, pmd_t *pmd, pgprot_t ref_prot) +{ + struct page *p = virt_to_page(pmd); + + /* Reserved pages have been already set up at boot. Don't touch those. */ + if (PageReserved(p)) + return; + + --page_private(p); + BUG_ON(page_private(p) < 0); + if (page_private(p) == 0) { + save_page(p); + reset_large_pte((pte_t *)pud, address & PUD_PAGE_MASK, + ref_prot); + } +} + /* * No more special protections in this
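
The core of split_pmd() in the patch above is a loop that rebuilds one 2MB mapping as 512 4K entries, giving the new protection only to the page being changed. Below is a minimal userspace sketch of that loop under simplifying assumptions: sketch_pfn_pte() is a hypothetical encoding (frame address in the high bits, protection in the low 12) and all sketch_ names are invented for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PTRS_PER_PTE	512
#define SKETCH_PMD_PAGE_MASK	(~((1ULL << 21) - 1))

/* Hypothetical entry layout: pfn shifted up, protection bits below. */
static uint64_t sketch_pfn_pte(uint64_t pfn, uint64_t prot)
{
	return (pfn << SKETCH_PAGE_SHIFT) | prot;
}

/*
 * Fill a 512-entry table covering the 2MB region that contains paddr.
 * Only the entry for paddr gets prot; the rest keep ref_prot.
 */
void sketch_split_pmd(uint64_t paddr, uint64_t prot, uint64_t ref_prot,
		      uint64_t table[SKETCH_PTRS_PER_PTE])
{
	uint64_t addr = paddr & SKETCH_PMD_PAGE_MASK;
	int i;

	for (i = 0; i < SKETCH_PTRS_PER_PTE; i++, addr += 1ULL << SKETCH_PAGE_SHIFT)
		table[i] = sketch_pfn_pte(addr >> SKETCH_PAGE_SHIFT,
					  addr == paddr ? prot : ref_prot);
}
```

split_gb() in the patch repeats the same idea one level up: 512 2MB entries, with the one covering paddr pointing at a table built as above.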
[PATCH] [8/8] GBPAGES: Do kernel direct mapping at boot using GB pages
This should decrease TLB pressure because the kernel will need less TLB faults for its own data access. Only done for 64bit because i386 does not support GB page tables. This only applies to the data portion of the direct mapping; the kernel text mapping stays with 2MB pages because the AMD Fam10h microarchitecture does not support GB ITLBs and AMD recommends against using GB mappings for code. Can be disabled with direct_gbpages=off Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/x86/mm/init_64.c | 63 ++ 1 file changed, 54 insertions(+), 9 deletions(-) Index: linux/arch/x86/mm/init_64.c === --- linux.orig/arch/x86/mm/init_64.c +++ linux/arch/x86/mm/init_64.c @@ -268,13 +268,20 @@ void early_iounmap(void *addr, unsigned __flush_tlb(); } +static unsigned long direct_entry(unsigned long paddr) +{ + unsigned long entry; + entry = __PAGE_KERNEL_LARGE|paddr; + entry &= __supported_pte_mask; + return entry; +} + static void __meminit phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end) { int i = pmd_index(address); for (; i < PTRS_PER_PMD; i++, address += PMD_SIZE) { - unsigned long entry; pmd_t *pmd = pmd_page + pmd_index(address); if (address >= end) { @@ -287,9 +294,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned if (pmd_val(*pmd)) continue; - entry = __PAGE_KERNEL_LARGE|_PAGE_GLOBAL|address; - entry &= __supported_pte_mask; - set_pmd(pmd, __pmd(entry)); + set_pmd(pmd, __pmd(direct_entry(address))); } } @@ -317,7 +322,13 @@ static void __meminit phys_pud_init(pud_ break; if (pud_val(*pud)) { - phys_pmd_update(pud, addr, end); + if (!pud_large(*pud)) + phys_pmd_update(pud, addr, end); + continue; + } + + if (direct_gbpages > 0) { + set_pud(pud, __pud(direct_entry(addr))); continue; } @@ -336,9 +347,11 @@ static void __init find_early_table_spac unsigned long puds, pmds, tables, start; puds = (end + PUD_SIZE - 1) >> PUD_SHIFT; - pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT; - tables = round_up(puds * sizeof(pud_t), PAGE_SIZE) + -round_up(pmds * 
sizeof(pmd_t), PAGE_SIZE); + tables = round_up(puds * sizeof(pud_t), PAGE_SIZE); + if (!direct_gbpages) { + pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT; + tables += round_up(pmds * sizeof(pmd_t), PAGE_SIZE); + } /* RED-PEN putting page tables only on node 0 could cause a hotspot and fill up ZONE_DMA. The page tables @@ -373,8 +386,15 @@ void __init_refok init_memory_mapping(un * mapped. Unfortunately this is done currently before the nodes are * discovered. */ - if (!after_bootmem) + if (!after_bootmem) { + if (direct_gbpages >= 0 && cpu_has_gbpages) { + printk(KERN_INFO "Using GB pages for direct mapping\n"); + direct_gbpages = 1; + } else + direct_gbpages = 0; + find_early_table_space(end); + } start = (unsigned long)__va(start); end = (unsigned long)__va(end); @@ -423,6 +443,27 @@ void __init paging_init(void) } #endif +static void split_gb_page(pud_t *pud, unsigned long paddr) +{ + int i; + pmd_t *pmd; + struct page *p = alloc_page(GFP_KERNEL); + if (!p) + return; + + Dprintk("split_gb_page %lx\n", paddr); + + SetPagePrivate(p); + /* Set reference to 1 so that c_p_a() does not undo it */ + page_private(p) = 1; + + paddr &= PUD_PAGE_MASK; + pmd = page_address(p); + for (i = 0; i < PTRS_PER_PTE; i++, paddr += PMD_PAGE_SIZE) + pmd[i] = __pmd(direct_entry(paddr)); + pud_populate(NULL, pud, pmd); +} + /* Unmap a kernel mapping if it exists. This is useful to avoid prefetches from the CPU leading to inconsistent cache lines. address and size must be aligned to 2MB boundaries. @@ -434,6 +475,8 @@ __clear_kernel_mapping(unsigned long add BUG_ON(address & ~PMD_PAGE_MASK); BUG_ON(size & ~PMD_PAGE_MASK); + + Dprintk("clear_kernel_mapping %lx-%lx\n", address, address+size); for (; address < end; address += PMD_PAGE_SIZE) { pgd_t *pgd = pgd_offset_k(address); @@ -442,6 +485,8 @@ __clear_kernel_mapping(unsigned long add if (pgd_none(*pgd)) continue; pud = pud_offset(pgd, address); + if (
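
The find_early_table_space() hunk above skips the PMD tables when GB pages are used. As a rough illustration of how much early page-table space that saves, here is a userspace sketch of the same arithmetic. It assumes 8-byte entries, 4K pages, and the x86-64 shifts of 21 and 30; the sketch_ names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL
#define SKETCH_ENTRY_SIZE	8ULL
#define SKETCH_PMD_SHIFT	21
#define SKETCH_PUD_SHIFT	30

static uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

/* Bytes of page tables needed to direct-map physical memory up to end. */
uint64_t sketch_table_bytes(uint64_t end, int use_gbpages)
{
	uint64_t puds = (end + (1ULL << SKETCH_PUD_SHIFT) - 1) >> SKETCH_PUD_SHIFT;
	uint64_t tables = sketch_round_up(puds * SKETCH_ENTRY_SIZE, SKETCH_PAGE_SIZE);

	if (!use_gbpages) {
		uint64_t pmds = (end + (1ULL << SKETCH_PMD_SHIFT) - 1) >> SKETCH_PMD_SHIFT;

		tables += sketch_round_up(pmds * SKETCH_ENTRY_SIZE, SKETCH_PAGE_SIZE);
	}
	return tables;
}
```

For 4GB of memory this gives 20480 bytes with 2MB pages and 4096 bytes with GB pages.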
[PATCH] [6/8] Add an option to disable direct mapping gbpages and a global variable
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 Documentation/x86_64/boot-options.txt |    3 +++
 arch/x86/mm/init_64.c                 |   12 ++++++++++++
 include/asm-x86/pgtable_64.h          |    2 ++
 3 files changed, 17 insertions(+)

Index: linux/arch/x86/mm/init_64.c
===================================================================
--- linux.orig/arch/x86/mm/init_64.c
+++ linux/arch/x86/mm/init_64.c
@@ -57,6 +57,18 @@ static unsigned long dma_reserve __initd
 
 DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
 
+int direct_gbpages;
+
+static int __init parse_direct_gbpages(char *arg)
+{
+	if (!strcmp(arg, "off")) {
+		direct_gbpages = -1;
+		return 0;
+	}
+	return -1;
+}
+early_param("direct_gbpages", parse_direct_gbpages);
+
 /*
  * NOTE: pagetable_init alloc all the fixmap pagetables contiguous on the
  * physical space so we can cache the place of the first one and move
Index: linux/include/asm-x86/pgtable_64.h
===================================================================
--- linux.orig/include/asm-x86/pgtable_64.h
+++ linux/include/asm-x86/pgtable_64.h
@@ -248,6 +248,8 @@ static inline int pud_large(pud_t pte)
 
 #define update_mmu_cache(vma,address,pte) do { } while (0)
 
+extern int direct_gbpages;
+
 /* Encode and de-code a swap entry */
 #define __swp_type(x)			(((x).val >> 1) & 0x3f)
 #define __swp_offset(x)			((x).val >> 8)
Index: linux/Documentation/x86_64/boot-options.txt
===================================================================
--- linux.orig/Documentation/x86_64/boot-options.txt
+++ linux/Documentation/x86_64/boot-options.txt
@@ -307,3 +307,6 @@ Debugging
 		stuck (default)
 
 Miscellaneous
+
+  direct_gbpages=off
+	Do not use GB pages for kernel direct mapping.
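
The variable added here is effectively tri-state: -1 when disabled on the command line, 0 while undecided, and 1 once patch 8/8 in this series decides to use GB pages. A minimal userspace sketch of that logic follows. The sketch_ names and the cpu_has_gb parameter are hypothetical stand-ins for direct_gbpages, parse_direct_gbpages() and cpu_has_gbpages.

```c
#include <string.h>

/* -1: disabled by "direct_gbpages=off", 0: undecided, 1: in use */
static int sketch_direct_gbpages;

/* Like parse_direct_gbpages() in the patch: only "off" is accepted. */
int sketch_parse_direct_gbpages(const char *arg)
{
	if (!strcmp(arg, "off")) {
		sketch_direct_gbpages = -1;
		return 0;
	}
	return -1;
}

/* Like the boot-time decision in patch 8/8: collapse to 0 or 1. */
int sketch_resolve_gbpages(int cpu_has_gb)
{
	if (sketch_direct_gbpages >= 0 && cpu_has_gb)
		sketch_direct_gbpages = 1;
	else
		sketch_direct_gbpages = 0;
	return sketch_direct_gbpages;
}
```

Using -1 for "off" lets the later code tell an explicit user request apart from the zero-initialised default.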
[PATCH] [5/8] GBPAGES: Support gbpages in pagetable dump
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86/mm/fault_64.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux/arch/x86/mm/fault_64.c
===================================================================
--- linux.orig/arch/x86/mm/fault_64.c
+++ linux/arch/x86/mm/fault_64.c
@@ -200,7 +200,8 @@ void dump_pagetable(unsigned long addres
 	pud = pud_offset(pgd, address);
 	if (bad_address(pud)) goto bad;
 	printk("PUD %lx ", pud_val(*pud));
-	if (!pud_present(*pud))	goto ret;
+	if (!pud_present(*pud) || pud_large(*pud))
+		goto ret;
 
 	pmd = pmd_offset(pud, address);
 	if (bad_address(pmd)) goto bad;
[PATCH] [3/8] GBPAGES: Split LARGE_PAGE_SIZE/MASK into PUD_PAGE_SIZE/PMD_PAGE_SIZE
Split the existing LARGE_PAGE_SIZE/MASK macro into two new macros PUD_PAGE_SIZE/MASK and PMD_PAGE_SIZE/MASK. Fix up all callers to use the new names. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/x86/boot/compressed/head_64.S |8 arch/x86/kernel/head_64.S |4 ++-- arch/x86/kernel/pci-gart_64.c |2 +- arch/x86/mm/init_64.c |6 +++--- arch/x86/mm/pageattr_64.c |4 ++-- include/asm-x86/page.h |4 ++-- include/asm-x86/page_32.h |4 include/asm-x86/page_64.h |3 +++ 8 files changed, 21 insertions(+), 14 deletions(-) Index: linux/include/asm-x86/page_64.h === --- linux.orig/include/asm-x86/page_64.h +++ linux/include/asm-x86/page_64.h @@ -23,6 +23,9 @@ #define MCE_STACK 5 #define N_EXCEPTION_STACKS 5 /* hw limit: 7 */ +#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT) +#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1)) + #define __PAGE_OFFSET _AC(0x8100, UL) #define __PHYSICAL_START CONFIG_PHYSICAL_START Index: linux/arch/x86/boot/compressed/head_64.S === --- linux.orig/arch/x86/boot/compressed/head_64.S +++ linux/arch/x86/boot/compressed/head_64.S @@ -80,8 +80,8 @@ startup_32: #ifdef CONFIG_RELOCATABLE movl%ebp, %ebx - addl$(LARGE_PAGE_SIZE -1), %ebx - andl$LARGE_PAGE_MASK, %ebx + addl$(PMD_PAGE_SIZE -1), %ebx + andl$PMD_PAGE_MASK, %ebx #else movl$CONFIG_PHYSICAL_START, %ebx #endif @@ -220,8 +220,8 @@ ENTRY(startup_64) /* Start with the delta to where the kernel will run at. 
*/ #ifdef CONFIG_RELOCATABLE leaqstartup_32(%rip) /* - $startup_32 */, %rbp - addq$(LARGE_PAGE_SIZE - 1), %rbp - andq$LARGE_PAGE_MASK, %rbp + addq$(PMD_PAGE_SIZE - 1), %rbp + andq$PMD_PAGE_MASK, %rbp movq%rbp, %rbx #else movq$CONFIG_PHYSICAL_START, %rbp Index: linux/arch/x86/kernel/pci-gart_64.c === --- linux.orig/arch/x86/kernel/pci-gart_64.c +++ linux/arch/x86/kernel/pci-gart_64.c @@ -501,7 +501,7 @@ static __init unsigned long check_iommu_ } a = aper + iommu_size; - iommu_size -= round_up(a, LARGE_PAGE_SIZE) - a; + iommu_size -= round_up(a, PMD_PAGE_SIZE) - a; if (iommu_size < 64*1024*1024) { printk(KERN_WARNING Index: linux/arch/x86/kernel/head_64.S === --- linux.orig/arch/x86/kernel/head_64.S +++ linux/arch/x86/kernel/head_64.S @@ -63,7 +63,7 @@ startup_64: /* Is the address not 2M aligned? */ movq%rbp, %rax - andl$~LARGE_PAGE_MASK, %eax + andl$~PMD_PAGE_MASK, %eax testl %eax, %eax jnz bad_address @@ -88,7 +88,7 @@ startup_64: /* Add an Identity mapping if I am above 1G */ leaq_text(%rip), %rdi - andq$LARGE_PAGE_MASK, %rdi + andq$PMD_PAGE_MASK, %rdi movq%rdi, %rax shrq$PUD_SHIFT, %rax Index: linux/arch/x86/mm/init_64.c === --- linux.orig/arch/x86/mm/init_64.c +++ linux/arch/x86/mm/init_64.c @@ -420,10 +420,10 @@ __clear_kernel_mapping(unsigned long add { unsigned long end = address + size; - BUG_ON(address & ~LARGE_PAGE_MASK); - BUG_ON(size & ~LARGE_PAGE_MASK); + BUG_ON(address & ~PMD_PAGE_MASK); + BUG_ON(size & ~PMD_PAGE_MASK); - for (; address < end; address += LARGE_PAGE_SIZE) { + for (; address < end; address += PMD_PAGE_SIZE) { pgd_t *pgd = pgd_offset_k(address); pud_t *pud; pmd_t *pmd; Index: linux/arch/x86/mm/pageattr_64.c === --- linux.orig/arch/x86/mm/pageattr_64.c +++ linux/arch/x86/mm/pageattr_64.c @@ -70,7 +70,7 @@ static struct page *split_large_page(uns page_private(base) = 0; address = __pa(address); - addr = address & LARGE_PAGE_MASK; + addr = address & PMD_PAGE_MASK; pbase = (pte_t *)page_address(base); for (i = 0; i < PTRS_PER_PTE; i++, addr 
+= PAGE_SIZE) { pbase[i] = pfn_pte(addr >> PAGE_SHIFT, @@ -150,7 +150,7 @@ static void revert_page(unsigned long ad BUG_ON(pud_none(*pud)); pmd = pmd_offset(pud, address); BUG_ON(pmd_val(*pmd) & _PAGE_PSE); - pfn = (__pa(address) & LARGE_PAGE_MASK) >> PAGE_SHIFT; + pfn = (__pa(address) & PMD_PAGE_MASK) >> PAGE_SHIFT; large_pte = pfn_pte(pfn, ref_prot); large_pte = pte_mkhuge(large_pte); set_pte((pte_t *)pmd, large_pte); Index: linux/include/asm-x86/page_32.h ===
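
To make the two granularities concrete, here is a small userspace sketch of the size and mask macros and of the round-up and mask idioms the callers above use. It assumes the x86-64 values PMD_SHIFT = 21 and PUD_SHIFT = 30; the SKETCH_ and sketch_ names are invented and are not the kernel macros.

```c
#include <stdint.h>

#define SKETCH_PMD_SHIFT	21
#define SKETCH_PUD_SHIFT	30

#define SKETCH_PMD_PAGE_SIZE	(1ULL << SKETCH_PMD_SHIFT)	/* 2MB */
#define SKETCH_PMD_PAGE_MASK	(~(SKETCH_PMD_PAGE_SIZE - 1))
#define SKETCH_PUD_PAGE_SIZE	(1ULL << SKETCH_PUD_SHIFT)	/* 1GB */
#define SKETCH_PUD_PAGE_MASK	(~(SKETCH_PUD_PAGE_SIZE - 1))

/* Same add-then-mask sequence as the addl/andl pair in head_64.S. */
uint64_t sketch_align_up(uint64_t addr, uint64_t size)
{
	return (addr + size - 1) & ~(size - 1);
}

/* Non-zero when addr is not aligned, like the BUG_ON() checks. */
int sketch_misaligned(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) != 0;
}
```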
[PATCH] [4/8] Add pgtable accessor functions for GB pages
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 include/asm-x86/pgtable_64.h |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux/include/asm-x86/pgtable_64.h
===================================================================
--- linux.orig/include/asm-x86/pgtable_64.h
+++ linux/include/asm-x86/pgtable_64.h
@@ -208,6 +208,12 @@ static inline unsigned long pmd_bad(pmd_
 #define pud_offset(pgd, address) ((pud_t *) pgd_page_vaddr(*(pgd)) + pud_index(address))
 #define pud_present(pud) (pud_val(pud) & _PAGE_PRESENT)
 
+static inline int pud_large(pud_t pte)
+{
+	return (pud_val(pte) & (_PAGE_PSE|_PAGE_PRESENT)) ==
+		(_PAGE_PSE|_PAGE_PRESENT);
+}
+
 /* PMD - Level 2 access */
 #define pmd_page_vaddr(pmd) ((unsigned long) __va(pmd_val(pmd) & PTE_MASK))
 #define pmd_page(pmd)		(pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
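
The accessor requires both bits to be set, so a non-present entry that happens to have the PSE bit set is not reported as a GB page. A minimal userspace sketch of the same test, assuming the usual x86 bit positions (present = bit 0, PSE = bit 7); the SKETCH_ names are hypothetical:

```c
#include <stdint.h>

#define SKETCH_PAGE_PRESENT	0x001ULL	/* bit 0 */
#define SKETCH_PAGE_PSE		0x080ULL	/* bit 7 */

/* Returns 1 only when the entry is both present and marked large. */
int sketch_pud_large(uint64_t pud_val)
{
	const uint64_t both = SKETCH_PAGE_PSE | SKETCH_PAGE_PRESENT;

	return (pud_val & both) == both;
}
```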
[PATCH] [0/8] GBpages support for x86-64, v2
This patch series supports using the new GB pages introduced with
AMD Quad Cores for the kernel direct mapping.

I addressed all reasonable feedback for the previous version, I believe.

Changes against previous version:
- Ported on top of latest git-x86 with PAT series
- Fixed some white space
- Clarify clear_kernel_mapping comments
- Minor cleanups

-Andi
[PATCH] [1/8] Handle kernel near memory hole in clear_kernel_mapping
This was a long standing obscure problem in the relocatable kernel. The
AMD GART driver needs to unmap part of the GART in the kernel direct mapping
to prevent cache corruption. With the relocatable kernel it is in theory
possible that the separate kernel text mapping straddles that area too.

Normally it should not happen because GART tends to be >= 2GB, and the kernel
is normally not loaded that high, but it is possible in theory.

Teach clear_kernel_mapping() about this case.

This will become more important once the kernel mapping uses 1GB pages.

Cc: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86/mm/init_64.c |   25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

Index: linux/arch/x86/mm/init_64.c
===================================================================
--- linux.orig/arch/x86/mm/init_64.c
+++ linux/arch/x86/mm/init_64.c
@@ -415,7 +415,8 @@ void __init paging_init(void)
    from the CPU leading to inconsistent cache lines. address and size
    must be aligned to 2MB boundaries.
    Does nothing when the mapping doesn't exist. */
-void __init clear_kernel_mapping(unsigned long address, unsigned long size)
+static void __init
+__clear_kernel_mapping(unsigned long address, unsigned long size)
 {
 	unsigned long end = address + size;
 
@@ -445,6 +446,28 @@ void __init clear_kernel_mapping(unsigne
 	__flush_tlb_all();
 }
 
+#define overlaps(as, ae, bs, be) ((ae) >= (bs) && (as) <= (be))
+
+void __init clear_kernel_mapping(unsigned long address, unsigned long size)
+{
+	int sh = PMD_SHIFT;
+	unsigned long kernel = __pa(__START_KERNEL_map);
+
+	/*
+	 * Note that we cannot unmap the kernel itself because the unmapped
+	 * holes here are always at least 2MB aligned.
+	 * This just applies to the trailing areas of the 40MB kernel mapping.
+	 */
+	if (overlaps(kernel >> sh, (kernel + KERNEL_TEXT_SIZE) >> sh,
+		     __pa(address) >> sh, __pa(address + size) >> sh)) {
+		printk(KERN_WARNING
+			"Kernel mapping at %lx within 2MB of memory hole\n",
+			kernel);
+		__clear_kernel_mapping(__START_KERNEL_map+__pa(address), size);
+	}
+	__clear_kernel_mapping(address, size);
+}
+
 /*
  * Memory hotplug specific functions
  */
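
The overlaps() test compares 2MB frame numbers with inclusive bounds, so two ranges that merely touch at a 2MB boundary also count as overlapping, which matches the "within 2MB" wording of the warning. A small userspace sketch of that behaviour, using an assumed shift of 21 and a hypothetical wrapper name:

```c
#include <stdint.h>

#define SKETCH_PMD_SHIFT 21

/* Same macro as in the patch: inclusive comparison on both ends. */
#define overlaps(as, ae, bs, be) ((ae) >= (bs) && (as) <= (be))

/* Compare two physical ranges at 2MB granularity. */
int sketch_ranges_touch(uint64_t a_start, uint64_t a_end,
			uint64_t b_start, uint64_t b_end)
{
	int sh = SKETCH_PMD_SHIFT;

	return overlaps(a_start >> sh, a_end >> sh,
			b_start >> sh, b_end >> sh);
}
```

As an example, take a 40MB mapping starting at 2MB (0x200000 to 0x2A00000): a hole starting right at 0x2A00000 is reported, while a hole starting at 64MB is not.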
[PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid bit
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 include/asm-x86/cpufeature.h |    2 ++
 1 file changed, 2 insertions(+)

Index: linux/include/asm-x86/cpufeature.h
===================================================================
--- linux.orig/include/asm-x86/cpufeature.h
+++ linux/include/asm-x86/cpufeature.h
@@ -49,6 +49,7 @@
 #define X86_FEATURE_MP		(1*32+19) /* MP Capable. */
 #define X86_FEATURE_NX		(1*32+20) /* Execute Disable */
 #define X86_FEATURE_MMXEXT	(1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_GBPAGES	(1*32+26) /* GB pages */
 #define X86_FEATURE_RDTSCP	(1*32+27) /* RDTSCP */
 #define X86_FEATURE_LM		(1*32+29) /* Long Mode (x86-64) */
 #define X86_FEATURE_3DNOWEXT	(1*32+30) /* AMD 3DNow! extensions */
@@ -173,6 +174,7 @@
 #define cpu_has_bts		boot_cpu_has(X86_FEATURE_BTS)
 #define cpu_has_pat		boot_cpu_has(X86_FEATURE_PAT)
 #define cpu_has_ss		boot_cpu_has(X86_FEATURE_SELFSNOOP)
+#define cpu_has_gbpages		boot_cpu_has(X86_FEATURE_GBPAGES)
 
 #if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
 # define cpu_has_invlpg		1
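
The (1*32+26) encoding packs a capability word index and a bit number into one integer: word 1 holds the AMD-defined flags from CPUID 0x80000001 EDX, and bit 26 there is the 1GB page flag. The following userspace sketch shows how such a value can be decoded against an array of capability words. The caps array and the sketch_ names are illustrative assumptions, not the kernel's boot_cpu_has() implementation.

```c
#define SKETCH_FEATURE_GBPAGES	(1*32+26)

/* Index of the 32-bit capability word that holds the feature. */
unsigned int sketch_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test one feature against an array of capability words. */
int sketch_has_feature(const unsigned int *caps, unsigned int feature)
{
	return (caps[sketch_feature_word(feature)] >>
		sketch_feature_bit(feature)) & 1U;
}
```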
Re: [PATCH 0/2] Relax restrictions on setting CONFIG_NUMA on x86
Mel Gorman <[EMAIL PROTECTED]> writes:

> A fix[1] was merged to the x86.git tree that allowed NUMA kernels to boot
> on normal x86 machines (and not just NUMA-Q, Summit etc.). I took a look
> at the restrictions on setting NUMA on x86 to see if they could be lifted.

The problem with i386 CONFIG_NUMA previously was not that it didn't boot on
normal non NUMA systems, but that it didn't boot on very common NUMA
systems: Opterons. Have you tested if that is fixed now?

-Andi
" Change size of node ids from u8 to u16 fixup" causes early panic in __build_all_zonelists
One of my test systems didn't boot with latest git-x86. I bisected it down
to f1321f875910172bcc3e1f302fe145a9e4d3bdf7

With later patches the fault seemed to happen even earlier, before other
initialization messages.

Config is available at http://halobates.de/config

-Andi

commit f1321f875910172bcc3e1f302fe145a9e4d3bdf7
Author: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Date:   Fri Jan 18 23:05:33 2008 +0100

    x86: Change size of node ids from u8 to u16 fixup

    Change the size of node ids for X86_64 from 8 bits to 16 bits
    to accommodate more than 256 nodes.

    Introduce a "numanode_t" type for x86-generic usage.

...
swsusp: Registered nosave memory region: ff70 - 0001
Allocating PCI resources starting at e200 (gap: e000:1ec0)
SMP: Allowing 4 CPUs, 0 hotplug CPUs
PERCPU: Allocating 34912 bytes of per cpu data
PANIC: early exception 0e rip 10:802602cd error 0 cr2 6b0
Pid: 0, comm: swapper Not tainted 2.6.24-rc8-g3c2d7552 #27

Call Trace:
 [] __build_all_zonelists+0x2a9/0x40a
 [] __build_all_zonelists+0xec/0x40a
 [] build_all_zonelists+0x1a0/0x244
 [] start_kernel+0x110/0x2bd
 [] _sinittext+0x1c6/0x1cd
RIP 0x10
Re: [PATCH] RUSAGE_THREAD
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Roland McGrath wrote:
> +#define	RUSAGE_LWP	RUSAGE_THREAD	/* Solaris name for same */

No need to clutter the kernel header with this, it'll be in the libc
header.

Aside from that:

Acked-by: Ulrich Drepper <[EMAIL PROTECTED]>

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFHkZbk2ijCOnn/RHQRAtohAKCyWgJsm20LSqxTznvff3LI8zplvgCgwttu
16eJFNgQXWNEk76b141uZvo=
=DzhA
-----END PGP SIGNATURE-----
Re: [PATCH 1/5] x86: Change size of node ids from u8 to u16 fixup
On Jan 18, 2008 9:17 PM, David Rientjes <[EMAIL PROTECTED]> wrote:
> On Fri, 18 Jan 2008, Yinghai Lu wrote:
>
> > > > I got
> > > > SART: PXM 0 -> APIC 0 -> Node 255
> > > > SART: PXM 0 -> APIC 1 -> Node 255
> > > > SART: PXM 1 -> APIC 2 -> Node 255
> > > > SART: PXM 1 -> APIC 3 -> Node 255
> > > >
> > >
> > > I assume this is a typo and those proximity mappings are actually from the
> > > SRAT.
> >
> > SRAT for processor only have
> > PXM and APIC id. setup_node(pxm) will get node id for pxm, start from 0...
> >
>
> I was referring to "SART" in your log.

I should have copied it instead of typing it...

YH
Re: [Announce] Development release 0.1 of the LatencyTOP tool
> syscall nr and pid at minimum then.

oprofile already supports logging the pid I believe. Otherwise the pid
filter in opreport could hardly work.

> Still doesn't work for modules either.

oprofile works fine for modules.

> what it ends up doing is using an entirely different interface for
> basically the same code / operation inside the kernel.

Well rather it uses an existing framework for something that fits it
well.

Also the way I proposed is very cheap and would be possible to use in
production kernels without special configs.

> The current interface code is maybe 80 lines of /proc code... and very
> simple to use (unlike the oprofile interface)

The oprofile interface is per CPU (so you wouldn't need to reinvent that
to fix your locking), and if you add the syscall logging feature to it,
it would apply to all profile events, e.g. you could then do things like
"matching cache misses to syscalls".

-andi
[PATCH] kernel/params.c: fix the module name length in param_sysfs_builtin
From: Denis Cheng <[EMAIL PROTECTED]> Date: Sat, 19 Jan 2008 13:29:51 +0800 Subject: [PATCH] kernel/params.c: fix the module name length in param_sysfs_builtin the original code use KOBJ_NAME_LEN for built-in module name length, that's defined to 20 in linux/kobject.h, but this is not enough appearntly, many module names are longer than this; #define KOBJ_NAME_LEN 20 another macro is MODULE_NAME_LEN defined in linux/module.h, I think this is enough for module names: #define MODULE_NAME_LEN (64 - sizeof(unsigned long)) Signed-off-by: Denis Cheng <[EMAIL PROTECTED]> --- kernel/params.c |8 +++- 1 files changed, 3 insertions(+), 5 deletions(-) diff --git a/kernel/params.c b/kernel/params.c index 7686417..a085b40 100644 --- a/kernel/params.c +++ b/kernel/params.c @@ -376,8 +376,6 @@ int param_get_string(char *buffer, struct kernel_param *kp) extern struct kernel_param __start___param[], __stop___param[]; -#define MAX_KBUILD_MODNAME KOBJ_NAME_LEN - struct param_attribute { struct module_attribute mattr; @@ -588,7 +586,7 @@ static void __init param_sysfs_builtin(void) { struct kernel_param *kp, *kp_begin = NULL; unsigned int i, name_len, count = 0; - char modname[MAX_KBUILD_MODNAME + 1] = ""; + char modname[MODULE_NAME_LEN + 1] = ""; for (i=0; i < __stop___param - __start___param; i++) { char *dot; @@ -596,12 +594,12 @@ static void __init param_sysfs_builtin(void) kp = &__start___param[i]; max_name_len = - min_t(size_t, MAX_KBUILD_MODNAME, strlen(kp->name)); + min_t(size_t, MODULE_NAME_LEN, strlen(kp->name)); dot = memchr(kp->name, '.', max_name_len); if (!dot) { DEBUGP("couldn't find period in first %d characters " - "of %s\n", MAX_KBUILD_MODNAME, kp->name); + "of %s\n", MODULE_NAME_LEN, kp->name); continue; } name_len = dot - kp->name; -- 1.5.3.5 -- Denis Cheng -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at 
http://www.tux.org/lkml/
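To illustrate why the limit matters, here is a minimal user-space sketch of the prefix lookup that the patch touches. It is not the kernel code: split_modname(), OLD_LIMIT and NEW_LIMIT are hypothetical names that only mirror the memchr() search over the first N characters of a "module.param" string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two kernel limits discussed above. */
#define OLD_LIMIT 20                             /* value of KOBJ_NAME_LEN */
#define NEW_LIMIT (64 - sizeof(unsigned long))   /* value of MODULE_NAME_LEN */

/*
 * Copy the module prefix of "module.param" into out.
 * Only the first `limit` characters are searched for the '.'.
 * Returns the prefix length, or -1 if no '.' was found in range
 * or the prefix does not fit into out.
 */
int split_modname(const char *name, size_t limit, char *out, size_t out_size)
{
    size_t len = strlen(name);
    size_t max_name_len = len < limit ? len : limit;
    const char *dot = memchr(name, '.', max_name_len);
    size_t name_len;

    if (!dot)
        return -1;              /* prefix longer than the search window */

    name_len = (size_t)(dot - name);
    if (name_len + 1 > out_size)
        return -1;              /* no room for the terminating NUL */

    memcpy(out, name, name_len);
    out[name_len] = '\0';
    return (int)name_len;
}
```

With a 31-character module prefix, the search limited to 20 characters never reaches the period, so the parameter would be skipped; with the larger limit the prefix is found.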
Re: [Announce] Development release 0.1 of the LatencyTOP tool
Andi Kleen wrote: another thing that the current profiling can't do, is to show what the system is doing when it hits the latency.. so someone calling fsync() will show up in the waiting for IO function, but not that it was due to an fsync(). Hmm so how about extending oprofile to always log the syscall number in the event logs (can be gotten from top of stack). syscall nr and pid at minimum then. Still doesn't work for modules either. what it ends up doing is using an entirely different interface for basically the same code / operation inside the kernel. The current interface code is maybe 80 lines of /proc code... and very simple to use (unlike the oprofile interface)
Re: [Announce] Development release 0.1 of the LatencyTOP tool II
On Sat, Jan 19, 2008 at 06:33:30AM +0100, Andi Kleen wrote: > > another thing that the current profiling can't do, is to show what the > > system is doing > > when it hits the latency.. so someone calling fsync() will show up in the > > waiting for > > IO function, but not that it was due to an fsync(). > > Hmm so how about extending oprofile to always log the syscall number > in the event logs (can be gotten from top of stack). I think given [...] OK, to handle exceptions like page faults this way you would need to save the vector somewhere on entry, but that shouldn't be very costly or difficult and could probably even be done unconditionally. -Andi
Re: [Announce] Development release 0.1 of the LatencyTOP tool
> another thing that the current profiling can't do, is to show what the > system is doing > when it hits the latency.. so someone calling fsync() will show up in the > waiting for > IO function, but not that it was due to an fsync(). Hmm so how about extending oprofile to always log the syscall number in the event logs (can be gotten from top of stack). I think given that you could reconstruct that data in the userland at least for single threads (not for work done on behalf of them in other threads; but I'm not sure you tried to solve that problem at all) The advantage is that it would be a generic mechanism that would work for all types of profiling. -Andi
Re: [Announce] Development release 0.1 of the LatencyTOP tool
Andi Kleen wrote: yes indeed; I sort of use the same infrastructure inside the scheduler; the biggest reason I felt I had to do something different was that I wanted to do per process data collection, so that you can see for a specific process what was going on. Wouldn't it have been easier then to just extend the sleep profiler to oprofile? oprofile already has pid filters and can do per process profiling. it's more complex than that On the other hand I'm not fully sure only doing per pid profiling is that useful. After all often latencies come from asynchronous threads (like kblockd). So a system level view is probably better anyways. another thing that the current profiling can't do, is to show what the system is doing when it hits the latency.. so someone calling fsync() will show up in the waiting for IO function, but not that it was due to an fsync().
Re: [Announce] Development release 0.1 of the LatencyTOP tool
> yes indeed; I sort of use the same infrastructure inside the scheduler; the > biggest > reason I felt I had to do something different was that I wanted to do per > process > data collection, so that you can see for a specific process what was going > on. Wouldn't it have been easier then to just extend the sleep profiler to oprofile? oprofile already has pid filters and can do per process profiling. On the other hand I'm not fully sure only doing per pid profiling is that useful. After all often latencies come from asynchronous threads (like kblockd). So a system level view is probably better anyways. -Andi
Re: [PATCH 1/5] x86: Change size of node ids from u8 to u16 fixup
On Fri, 18 Jan 2008, Yinghai Lu wrote: > > > I got > > > SART: PXM 0 -> APIC 0 -> Node 255 > > > SART: PXM 0 -> APIC 1 -> Node 255 > > > SART: PXM 1 -> APIC 2 -> Node 255 > > > SART: PXM 1 -> APIC 3 -> Node 255 > > > > > > > I assume this is a typo and those proximity mappings are actually from the > > SRAT. > > SRAT for processor only have > PXM and APIC id. setup_node(pxm) will get node id for pxm, start from 0... > I was referring to "SART" in your log.
create a file in kernel mode. help please!
Hi there, obviously this is a newbie question, but I couldn't find any documentation on how to do it. I tried several ways but couldn't do it. I designed a system call, so a user will call it, and a new file will be created ('/tmp/filexx'). After that, I have another system call that will map the file into the maps of the user process. The idea is the same as IPC... I managed to create the file with this function (in the first system call): fd = filp_open(path, O_CREAT | O_RDWR , 777); After that, the user will call another system call, and it will map this file to the process maps, something like this: filp_open(route, O_RDWR,0 ); do_mmap(fp, 0, tamano, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED, 0); After I call the second system call, the user tries to access the memory, but gets the message "Bus Error". I tried to manually create a file with vi and then use the second system call, and it worked perfectly. I could use the shared memory without problems. The problem seems to be in the first system call (with filp_open), when I try to create a new file... Can somebody suggest how I could fix this issue? It is very important because it is for a college project. Greetings, and thanks in advance for the answers. Rafael Sisto - Uruguay.-
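One likely cause of the "Bus Error", offered as a guess rather than a diagnosis: a file freshly created with O_CREAT is 0 bytes long, and touching a page of a shared file mapping that lies entirely beyond the end of the file raises SIGBUS. A file saved from vi already has some length, which would explain why that case worked. Note also that 777 without a leading zero is decimal (octal 01411), not the permission bits 0777. The sketch below shows the same sequence from user space with the file extended before it is mapped; create_and_map() is a hypothetical helper, and this is not kernel code.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or truncate) path, give it a real length, map it shared and
 * touch the first and last byte. len must be greater than zero.
 * Returns 0 on success, -1 on error. The file is removed again.
 */
int create_and_map(const char *path, size_t len)
{
    /* Octal 0600; a plain 600 or 777 would be read as decimal. */
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    char *p;

    if (fd < 0)
        return -1;

    /* A freshly created file is 0 bytes long; extend it before mapping. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping keeps the file referenced */
    if (p == MAP_FAILED) {
        unlink(path);
        return -1;
    }

    p[0] = 'a';                 /* would raise SIGBUS without ftruncate() */
    p[len - 1] = 'z';

    munmap(p, len);
    unlink(path);
    return 0;
}
```

Inside the kernel the equivalent step would be to give the new file a non-zero size before the second system call maps it; how best to do that there is left open here.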
Re: [Announce] Development release 0.1 of the LatencyTOP tool
Andi Kleen wrote: Arjan van de Ven <[EMAIL PROTECTED]> writes: The Intel Open Source Technology Center is pleased to announce the release of version 0.1 of LatencyTOP, a tool for developers to visualize system latencies. Just for completeness -- Linux already had a way to profile latencies since quite some time. It's little known unfortunately and doesn't work for modules since it's a special mode in the old non modular kernel profiler. You enable CONFIG_SCHEDSTATS and boot with profile=sleep and then you can use the readprofile command to read the data. Information can be reset with echo > /proc/profile There's also a profile=sched to profile the scheduler which works even without CONFIG_SCHEDSTATS yes indeed; I sort of use the same infrastructure inside the scheduler; the biggest reason I felt I had to do something different was that I wanted to do per process data collection, so that you can see for a specific process what was going on.
Re: [PATCH] x86: Unify printk strings in fault_32|64.c
On Sat, 2008-01-19 at 06:08 +0100, Andi Kleen wrote: > On Saturday 19 January 2008 05:22:29 Harvey Harrison wrote: > > Adding the address of the faulting library missed removing a > > line ending from X86_32. > > > > Also update the shorter printk format for X86_32 in fault_64.c > > to make it easier to see the remaining differences. > > Thanks. I think it was correct initially, but one of the merge steps > with the changing git-x86 caused some hunks to be dropped and the patch > never quite recovered from that. No worries, hoping to get them unified this weekend, should make this easier going forward. Harvey
Re: [PATCH] x86: Unify printk strings in fault_32|64.c
On Saturday 19 January 2008 05:22:29 Harvey Harrison wrote: > Adding the address of the faulting library missed removing a > line ending from X86_32. > > Also update the shorter printk format for X86_32 in fault_64.c > to make it easier to see the remaining differences. Thanks. I think it was correct initially, but one of the merge steps with the changing git-x86 caused some hunks to be dropped and the patch never quite recovered from that. -Andi
Re: [Announce] Development release 0.1 of the LatencyTOP tool
Arjan van de Ven <[EMAIL PROTECTED]> writes: > The Intel Open Source Technology Center is pleased to announce the > release of version 0.1 of LatencyTOP, a tool for developers to visualize > system latencies. Just for completeness -- Linux already had a way to profile latencies since quite some time. It's little known unfortunately and doesn't work for modules since it's a special mode in the old non modular kernel profiler. You enable CONFIG_SCHEDSTATS and boot with profile=sleep and then you can use the readprofile command to read the data. Information can be reset with echo > /proc/profile There's also a profile=sched to profile the scheduler which works even without CONFIG_SCHEDSTATS Latencytop will be probably a little more user friendly though. -Andi
[PATCH for mm] Remove iBCS support
Hi Andrew, Can you please queue this patch in -mm for .25. It was posted earlier and nobody complained. Thanks, -Andi Remove ibcs2 support in ELF loader too ibcs2 support has never been supported on 2.6 kernels as far as I know, and if it has it must have been an external patch. Anyways, if anybody applies an external patch they could as well readd the ibcs checking code to the ELF loader in the same patch. But there is no reason to keep this code running in all Linux kernels. This will save at least two strcmps each ELF execution. No deprecation period because it could not have been used anyways. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- fs/binfmt_elf.c | 15 +++ 1 file changed, 3 insertions(+), 12 deletions(-) Index: linux/fs/binfmt_elf.c === --- linux.orig/fs/binfmt_elf.c +++ linux/fs/binfmt_elf.c @@ -530,7 +530,6 @@ static int load_elf_binary(struct linux_ unsigned long load_addr = 0, load_bias = 0; int load_addr_set = 0; char * elf_interpreter = NULL; - unsigned char ibcs2_interpreter = 0; unsigned long error; struct elf_phdr *elf_ppnt, *elf_phdata; unsigned long elf_bss, elf_brk; @@ -647,14 +646,6 @@ static int load_elf_binary(struct linux_ if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* If the program interpreter is one of these two, -* then assume an iBCS2 image. Otherwise assume -* a native linux image. -*/ - if (strcmp(elf_interpreter,"/usr/lib/libc.so.1") == 0 || - strcmp(elf_interpreter,"/usr/lib/ld.so.1") == 0) - ibcs2_interpreter = 1; - /* * The early SET_PERSONALITY here is so that the lookup * for the interpreter happens in the namespace of the @@ -674,7 +665,7 @@ static int load_elf_binary(struct linux_ * switch really is going to happen - do this in * flush_thread(). 
- akpm */ - SET_PERSONALITY(loc->elf_ex, ibcs2_interpreter); + SET_PERSONALITY(loc->elf_ex, 0); interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); @@ -725,7 +716,7 @@ static int load_elf_binary(struct linux_ goto out_free_dentry; } else { /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex, ibcs2_interpreter); + SET_PERSONALITY(loc->elf_ex, 0); } /* OK, we are done with that, now set up the arg stuff, @@ -748,7 +739,7 @@ static int load_elf_binary(struct linux_ /* Do this immediately, since STACK_TOP as used in setup_arg_pages may depend on the personality. */ - SET_PERSONALITY(loc->elf_ex, ibcs2_interpreter); + SET_PERSONALITY(loc->elf_ex, 0); if (elf_read_implies_exec(loc->elf_ex, executable_stack)) current->personality |= READ_IMPLIES_EXEC; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: crash in kmem_cache_init
On Thu, 17 Jan 2008, Olaf Hering wrote: > On Thu, Jan 17, Olaf Hering wrote: > > > Since -mm boots further, what patch should I try? > > rc8-mm1 crashes as well, l3 passed to reap_alien() is NULL. Sigh. It looks like we need alien cache structures in some cases for nodes that have no memory. We must allocate structures for all nodes regardless if they have allocatable memory or not.
[PATCH] Remove information leak in Linux CIFS client
Fix information leak in CIFS client lookup Putting arbitrary file names on lookup failures into the system log is not a good idea, because usually everybody can read dmesg and that is thus an information leak if a directory was read protected. Also changed the error printout for this case to a signed number, because it is normally negative and that makes it easier to read. I'm not sure the message is all that useful anyways. Perhaps it should be just removed completely? Or at least rate limited, because it allows spamming the kernel log nicely. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Index: linux/fs/cifs/dir.c === --- linux.orig/fs/cifs/dir.c +++ linux/fs/cifs/dir.c @@ -518,7 +518,7 @@ cifs_lookup(struct inode *parent_dir_ino /* if it was once a directory (but how can we tell?) we could do shrink_dcache_parent(direntry); */ } else { - cERROR(1, ("Error 0x%x on cifs_get_inode_info in lookup of %s", + cERROR(1, ("Error %d on cifs_get_inode_info in lookup of file", rc, full_path)); /* BB special case check for Access Denied - watch security exposure of returning dir info implicitly via different rc
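As a small illustration of the readability point about signed error codes, the following user-space sketch formats the same negative return value both ways. format_rc_hex() and format_rc_dec() are hypothetical helpers, not part of the CIFS code, and the hex output shown in the comments assumes a 32-bit int.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old style: the error code printed as an unsigned hex number. */
void format_rc_hex(int rc, char *buf, size_t size)
{
    snprintf(buf, size, "Error 0x%x", (unsigned int)rc);
}

/* New style: the error code printed as a signed decimal number. */
void format_rc_dec(int rc, char *buf, size_t size)
{
    snprintf(buf, size, "Error %d", rc);
}
```

For rc = -2 the hex form comes out as "Error 0xfffffffe" while the decimal form is "Error -2", which maps directly onto the errno value.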
Re: crash in kmem_cache_init
On Fri, 18 Jan 2008, Olaf Hering wrote: > calls cache_grow with nodeid 0 > > [c075bbd0] [c00f82d0] .cache_alloc_refill+0x234/0x2c0 > calls cache_grow with nodeid 0 > > [c075bbe0] [c00f7f38] .cache_alloc_node+0x17c/0x1e8 > > calls cache_grow with nodeid 1 > > [c075bbe0] [c00f7d68] .fallback_alloc+0x1a0/0x1f4 Okay that makes sense. You have no node 0 with normal memory but the node assigned to the executing processor is zero (correct?). Thus it needs to fallback to node 1 and that is not possible during bootstrap. You need to run kmem_cache_init() on a cpu on a processor with memory. Or we need to revert the patch which would allocate control structures again for all online nodes regardless if they have memory or not. Does reverting 04231b3002ac53f8a64a7bd142fde3fa4b6808c6 change the situation? (However, we tried this on the other thread without success).
Re: PROBLEM: Celeron Core
> It will relative to not throttling. No it will not. Please reread Dominik's mail I linked to. It explains it clearly. > You made a claim that is -physically impossible- as stated, a claim I've > seen here before and I'm correcting it. If something reduces heat, it > must save power *by the definition of heat and power*. And if you reduce > power usage, you will make your battery last longer. I think the misunderstanding on your side is relative to what there is less heat. Throttling essentially reduces temporary heat spikes on the silicon, but does not make the system overall take less power or generate less heat as measured over a longer time because it will be idle less. -Andi
Re: non-choice related config entries within choice
Hi, On Wed, 16 Jan 2008, Sam Ravnborg wrote: > But one feature I really would like to see is named chocies so we can do > stuff like: > > choice X86_PROCESSOR > > config GENERIC_PROCESSOR > bool "A generic X86 processor" > endchoice > > > ... > > choice PPC_PROCESSOR > > config GENERIC_PROCESSOR > bool "A generic PowerPC processor > > endchoice > > The issue here is that we do not today allow the same config option > to appear if more than one choice. What I have in mind is slightly different, above choices would simply be called PROCESSOR, which would tell kconfig that all choices belong to the same group. bye, Roman
Re: [PATCH 1/5] x86: Change size of node ids from u8 to u16 fixup
On Jan 18, 2008 8:36 PM, David Rientjes <[EMAIL PROTECTED]> wrote: > On Fri, 18 Jan 2008, Yinghai Lu wrote: > > > > +#if MAX_NUMNODES > 256 > > > +typedef u16 numanode_t; > > > +#else > > > +typedef u8 numanode_t; > > > +#endif > > > + > > > #endif /* _LINUX_NUMA_H */ > > > > that is wrong, you can not change pxm_to_node_map from int to u8 or u16. > > > > Yeah, NID_INVAL is negative so no unsigned type will work here, > unfortunately. And that reduces the intended savings of your change since > the smaller type can only be used with a smaller CONFIG_NODES_SHIFT. > > > int acpi_map_pxm_to_node(int pxm) > > { > > int node = pxm_to_node_map[pxm]; > > > > if (node < 0){ > > if (nodes_weight(nodes_found_map) >= MAX_NUMNODES) > > return NID_INVAL; > > node = first_unset_node(nodes_found_map); > > __acpi_map_pxm_to_node(pxm, node); > > node_set(node, nodes_found_map); > > } > > > > return node; > > } > > > > node will will be always 255 or 65535 > > > > Right. > > > please keep that to int. > > > > I got > > SART: PXM 0 -> APIC 0 -> Node 255 > > SART: PXM 0 -> APIC 1 -> Node 255 > > SART: PXM 1 -> APIC 2 -> Node 255 > > SART: PXM 1 -> APIC 3 -> Node 255 > > > > I assume this is a typo and those proximity mappings are actually from the > SRAT. SRAT for processor only have PXM and APIC id. setup_node(pxm) will get node id for pxm, start from 0... > > if (node < 0){ > > if (nodes_weight(nodes_found_map) >= MAX_NUMNODES) > > return NID_INVAL; > > node = first_unset_node(nodes_found_map); > > __acpi_map_pxm_to_node(pxm, node); > > node_set(node, nodes_found_map); > > } YH -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PROBLEM: Celeron Core
On Sat, 2008-01-19 at 05:27 +0100, Andi Kleen wrote: > > So while throttling may be less efficient in terms of watt seconds used > > to compile something than running at full speed, it is incorrect to say > > it uses less power. One machine running for an hour throttled to 50% > > uses less power (and therefore less battery and cooling) than another > > running at full speed for that same hour. > > Not for the same unit of work. If you just run endless loops you > might be true, but most systems don't do that. Yes, most systems idle. > In terms of laptops (or rather in most other systems too) you usually care > about battery life time while the system is mostly idling (waiting > for your key strokes etc.). In this case enabling throttling > as a cpufreq driver will not make your battery last longer. It will relative to not throttling. You made a claim that is -physically impossible- as stated, a claim I've seen here before and I'm correcting it. If something reduces heat, it must save power *by the definition of heat and power*. And if you reduce power usage, you will make your battery last longer. Make any other statement you want about the efficiency of throttling per unit work or the effectiveness of throttling relative to other methods, just stop repeating the claim that "throttling reduces heat but doesn't save power". It goes against the law of conservation of energy. -- Mathematics is the supreme nostalgia of our time.
Re: [PATCH 1/5] x86: Change size of node ids from u8 to u16 fixup
On Fri, 18 Jan 2008, Yinghai Lu wrote: > > +#if MAX_NUMNODES > 256 > > +typedef u16 numanode_t; > > +#else > > +typedef u8 numanode_t; > > +#endif > > + > > #endif /* _LINUX_NUMA_H */ > > that is wrong, you can not change pxm_to_node_map from int to u8 or u16. > Yeah, NID_INVAL is negative so no unsigned type will work here, unfortunately. And that reduces the intended savings of your change since the smaller type can only be used with a smaller CONFIG_NODES_SHIFT. > int acpi_map_pxm_to_node(int pxm) > { > int node = pxm_to_node_map[pxm]; > > if (node < 0){ > if (nodes_weight(nodes_found_map) >= MAX_NUMNODES) > return NID_INVAL; > node = first_unset_node(nodes_found_map); > __acpi_map_pxm_to_node(pxm, node); > node_set(node, nodes_found_map); > } > > return node; > } > > node will always be 255 or 65535 > Right. > please keep that to int. > > I got > SART: PXM 0 -> APIC 0 -> Node 255 > SART: PXM 0 -> APIC 1 -> Node 255 > SART: PXM 1 -> APIC 2 -> Node 255 > SART: PXM 1 -> APIC 3 -> Node 255 > I assume this is a typo and those proximity mappings are actually from the SRAT. David
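The effect described in this exchange can be reproduced in a few lines of user-space C. The sketch below assumes NID_INVAL is -1, as the thread implies; the helper names are hypothetical and only mimic storing the marker in an unsigned slot and reading it back into an int, as acpi_map_pxm_to_node() does.

```c
#include <assert.h>
#include <stdint.h>

#define NID_INVAL (-1)  /* assumed value of the "no node yet" marker */

/* Store NID_INVAL in an 8-bit unsigned slot and read it back as int. */
int read_back_u8(void)
{
    uint8_t slot = (uint8_t)NID_INVAL;
    return slot;                /* 255, no longer negative */
}

/* Same with a 16-bit unsigned slot. */
int read_back_u16(void)
{
    uint16_t slot = (uint16_t)NID_INVAL;
    return slot;                /* 65535 */
}

/* With a plain int slot the marker survives the round trip. */
int read_back_int(void)
{
    int slot = NID_INVAL;
    return slot;
}

/* The test acpi_map_pxm_to_node() uses to decide whether to assign a node. */
int needs_mapping(int node)
{
    return node < 0;
}
```

Because the value read back from an unsigned slot is never negative, the branch that assigns a real node id is skipped and the marker itself (255 or 65535) is reported as the node, which matches the "Node 255" lines in the log.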
Re: non-choice related config entries within choice
Hi, On Wed, 16 Jan 2008, Jan Beulich wrote: > now that I finally found time to look into the problems that caused the > patch changing boolean/tristate choice behavior to be reverted I find > that due to the way things worked in the past there are a couple of > cases where config options not really belonging to the choice are inside > the choice scope (drivers/usb/gadget/Kconfig, arch/ppc/Kconfig, and > arch/mips/Kconfig are where I found such cases, and I hope this is a > complete list). > > The question is: Is it intended for this to work the way it used to, or > is it rather reasonable to change these scripts so that stuff dependent > upon the choice selection is being dealt with outside the choice scope? This is really a feature, try it with a visible option there which depends on a choice option. First, for the choice type I think it's simpler to just look at the first choice option; anything more complex simply has to specify the type explicitly. The bigger problem is that menu_finalize() is a little complex, which makes such changes more difficult: basically it does two things (updating the dependencies and generating the menu structure) in one pass and it depends on a specific order, which is nonobvious. I really should clean this up to make it easier to follow what's happening. For now this means the dependency on the choice symbol has to be added a little later, right before the call to menu_add_symbol(). bye, Roman
[PATCH] [1/2] Fix some inaccurate comments in MTRR checking code
- is_cpu(INTEL) actually refers only to the MTRR architecture and all AMD CPUs since K7 use the Intel MTRR architecture so the fixup code runs on AMD too. Remove a comment claiming otherwise. [Perhaps is_cpu should be renamed, the name is clearly confusing] - Clarify another incorrect comment. Cc: [EMAIL PROTECTED] Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/x86/kernel/cpu/mtrr/main.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) Index: linux/arch/x86/kernel/cpu/mtrr/main.c === --- linux.orig/arch/x86/kernel/cpu/mtrr/main.c +++ linux/arch/x86/kernel/cpu/mtrr/main.c @@ -640,6 +640,8 @@ early_param("disable_mtrr_trim", disable * Some buggy BIOSes don't setup the MTRRs properly for systems with certain * memory configurations. This routine checks to make sure the MTRRs having * a write back type cover all of the memory the kernel is intending to use. + * [AK: actually it doesn't check that. It just checks that the highest + * MTRR is matching the end of memory. That is not quite the same.] * If not, it'll trim any memory off the end by adjusting end_pfn, removing * it from the kernel's allocation pools, warning the user with an obnoxious * message. @@ -649,7 +651,6 @@ void __init mtrr_trim_uncached_memory(vo unsigned long i, base, size, highest_addr = 0, def, dummy; mtrr_type type; - /* Make sure we only trim uncachable memory on Intel machines */ rdmsr(MTRRdefType_MSR, def, dummy); def &= 0xff; if (!is_cpu(INTEL) || disable_mtrr_trim || def != MTRR_TYPE_UNCACHABLE) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] [2/2] Fix MTRR check on AMD systems with > 4GB RAM
Newer AMD systems (since K8RevF) have a magic SYSCFG MSR bit to force WB on memory beyond 4GB. This is not reflected in the standard MTRR MSRs, so the MTRR checking routine would get confused and disable perfectly good RAM beyond 4GB. Implement code for checking that bit. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/x86/kernel/cpu/mtrr/main.c | 34 ++ 1 file changed, 34 insertions(+) Index: linux/arch/x86/kernel/cpu/mtrr/main.c === --- linux.orig/arch/x86/kernel/cpu/mtrr/main.c +++ linux/arch/x86/kernel/cpu/mtrr/main.c @@ -634,6 +634,37 @@ static int __init disable_mtrr_trim_setu early_param("disable_mtrr_trim", disable_mtrr_trim_setup); #ifdef CONFIG_X86_64 + +/* + * Newer AMD K8s and later CPUs have a special magic MSR way to force WB + * for memory >4GB. Check for that here. + * Note this won't check if the MTRRs < 4GB where the magic bit doesn't + * apply to are wrong, but so far we don't know of any such case in the wild. + */ + +#define Tom2ForceMemTypeWB (1U << 22) +static __init int amd_special_default_mtrr(void) +{ + u32 l, h; + + /* Doesn't apply to memory < 4GB */ + if (end_pfn <= (0x >> PAGE_SHIFT)) + return 0; + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) + return 0; + if (boot_cpu_data.x86 < 0xf || boot_cpu_data.x86 > 0x11) + return 0; + /* In case some hypervisor doesn't pass SYSCFG through */ + if (rdmsr_safe(MSR_K8_SYSCFG, &l, &h) < 0) + return 0; + /* Memory between 4GB and top of mem is forced WB by this magic bit. +* Reserved before K8RevF, but should be zero there. 
+*/ + if (l & Tom2ForceMemTypeWB) + return 1; + return 0; +} + /** * mtrr_trim_uncached_memory - trim RAM not covered by MTRRs * @@ -667,6 +698,9 @@ void __init mtrr_trim_uncached_memory(vo highest_addr = base + size; } + if (amd_special_default_mtrr()) + return; + if ((highest_addr >> PAGE_SHIFT) < end_pfn) { printk(KERN_WARNING "***\n"); printk(KERN_WARNING " WARNING: likely BIOS bug\n"); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles
Hi - On Fri, Jan 18, 2008 at 10:55:27PM -0500, Steven Rostedt wrote: > [...] > > All this complexity is to be justified by keeping the raw prev/next > > pointers from being sent to a naive tracer? It seems to me way out of > > proportion. > > Damn, and I just blew away all my marker code for something like this ;-) Sorry! :-) > [...] > We have in sched.c the following marker: > trace_mark(kernel_sched_scheduler, "prev %p next %p", prev, next); Fine so far! > Then Mathieu can add in some code somewhere (or a module, or something) > ret = marker_probe_register("kernel_sched_scheduler", > "prev %p next %p", > pretty_print_sched_switch, NULL); > static void pretty_print_sched_switch(const struct marker *mdata, > void *private_data, > const char *format, ...) > { > [...] > trace_mark(kernel_pretty_print_sched_switch, > "prev_pid %d next_pid %d prev_state %ld", > prev->pid, next->pid, prev->state); > } That marker_probe_register call would need to be done only when the embedded (k_p_p_s_s) marker is actually being used. Otherwise we'd lose all the savings of a dormant sched.c marker by always calling into pretty_print_sched_switch(), whether or not the k_p_p_s_s marker was active. In any case, if the naive tracer agrees to become educated about some of these markers in the form of intermediary functions like that, they need not insist on a second hop through marker territory anyway: static void pretty_print_sched_switch(const struct marker *mdata, void *private_data, const char *format, ...) { [...] lttng_backend_trace(kernel_pretty_print_sched_switch, "prev_pid %d next_pid %d prev_state %ld", prev->pid, next->pid, prev->state); } - FChE -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH -v6 2/2] Updating ctime and mtime for memory-mapped files
On Fri, 18 Jan 2008 18:50:03 -0600 Matt Mackall <[EMAIL PROTECTED]> wrote: > On Fri, 2008-01-18 at 17:54 -0500, Rik van Riel wrote: > > Backup programs not seeing an updated mtime is a really big deal. > > And that's fixed with the 4-line approach. > > Reminds me, I've got a patch here for addressing that problem with loop > mounts: > > Writes to loop should update the mtime of the underlying file. > > Signed-off-by: Matt Mackall <[EMAIL PROTECTED]> Acked-by: Rik van Riel <[EMAIL PROTECTED]> -- All rights reversed.
Re: PROBLEM: Celeron Core
> So while throttling may be less efficient in terms of watt seconds used > to compile something than running at full speed, it is incorrect to say > it uses less power. One machine running for an hour throttled to 50% > uses less power (and therefore less battery and cooling) than another > running at full speed for that same hour. Not for the same unit of work. If you just run endless loops that might be true, but most systems don't do that. On laptops (and on most other systems too) you usually care about battery lifetime while the system is mostly idling (waiting for your keystrokes etc.). In this case enabling throttling as a cpufreq driver will not make your battery last longer. Also, skipping clocks does not actually save very much power compared to what C-states or SpeedStep do (like dropping voltage). This means enabling it will likely make your laptop battery run down sooner. -Andi
[PATCH] x86: Unify printk strings in fault_32|64.c
Adding the address of the faulting library missed removing a line ending from X86_32. Also update the shorter printk format for X86_32 in fault_64.c to make it easier to see the remaining differences. Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]> --- Ingo, trivial printk update after Andi's patches. arch/x86/mm/fault_32.c |2 +- arch/x86/mm/fault_64.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/fault_32.c b/arch/x86/mm/fault_32.c index 793e830..0bd2417 100644 --- a/arch/x86/mm/fault_32.c +++ b/arch/x86/mm/fault_32.c @@ -589,7 +589,7 @@ bad_area_nosemaphore: printk_ratelimit()) { printk( #ifdef CONFIG_X86_32 - "%s%s[%d]: segfault at %lx ip %08lx sp %08lx error %lx\n", + "%s%s[%d]: segfault at %lx ip %08lx sp %08lx error %lx", #else "%s%s[%d]: segfault at %lx ip %lx sp %lx error %lx", #endif diff --git a/arch/x86/mm/fault_64.c b/arch/x86/mm/fault_64.c index 9270a7d..9ac449e 100644 --- a/arch/x86/mm/fault_64.c +++ b/arch/x86/mm/fault_64.c @@ -591,7 +591,7 @@ bad_area_nosemaphore: printk_ratelimit()) { printk( #ifdef CONFIG_X86_32 - "%s%s[%d]: segfault at %08lx ip %08lx sp %08lx error %lx\n", + "%s%s[%d]: segfault at %lx ip %08lx sp %08lx error %lx", #else "%s%s[%d]: segfault at %lx ip %lx sp %lx error %lx", #endif -- 1.5.4.rc3.1118.gf6754c
Re: PROBLEM: Celeron Core
On Sat, 2008-01-19 at 02:15 +0100, Andi Kleen wrote: > On Fri, Jan 18, 2008 at 06:27:57PM -0600, Matt Mackall wrote: > > > > On Fri, 2008-01-18 at 22:11 +0100, Andi Kleen wrote: > > > Chodorenko Michail <[EMAIL PROTECTED]> writes: > > > > > > > I have a laptop "Extensa 5220", with the processor Celeron based on > > > > 'core' > > > > technology. > > > > There is ~ / arch/i386/kernel/cpu/cpufreq/p4-clockmod.c in the kernel > > > > source code > > > > but there's no line identification of my CPU for apply freqency change > > > > need to add a ID line 0х16 > > > > > > Note that driver will likely do clock throttling on your CPU. > > > Using that is usually a bad idea because it does not actually > > > safe power. It's only intended to let the CPU cool down in some > > > situations. > > > > Power consumption is more or less exactly equal to heat production > > (that's where the power goes, after all!), so either clock throttling > > DOES save power or it DOES NOT cool the CPU. > > No actually the way it works on modern x86 CPUs is that the best > strategy for saving power is to do things quickly and then > idle longer. That means on anything that has reasonably > deep sleep modi e.g. on older server/desktop systems things might > be slightly different because they had very little power saving > features enabled, but it's definitely true for all > laptop systems from the last several years. But even > on desktop/server throttling tends to be a bad idea. Dominik is measuring energy expended (watts * seconds) vs work done (CPU cycles). But your claim above is "clock throttling...does not save power [but it lets] the CPU cool down", which talks about power (watts) and heat (also watts, in fact the *very same* watts) and is physically impossible. A CPU turns power into heat. Less heat out implies less power in. So while throttling may be less efficient in terms of watt seconds used to compile something than running at full speed, it is incorrect to say it uses less power. 
One machine running for an hour throttled to 50% uses less power (and therefore less battery and cooling) than another running at full speed for that same hour. The first machine may take significantly longer to complete its task (or it may not, if the task is reading email or watching video), but that's another matter entirely. And whether it's more or less efficient than other power-saving approaches is also another matter. Throttling does save power. -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
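Much of the disagreement in this thread is about power (watts at a given moment) versus energy spent on a fixed job (watt-seconds). A rough sketch of that arithmetic follows. The function names are made up for illustration, and any wattages or durations fed into them are hypothetical inputs, not measurements from anyone in this thread.

```c
/* Energy in millijoules for a phase drawing `milliwatts` for `seconds`. */
unsigned long energy_mj(unsigned long milliwatts, unsigned long seconds)
{
    return milliwatts * seconds;
}

/* Finish the job at full speed, then idle for the rest of the window. */
unsigned long race_to_idle_mj(unsigned long busy_mw, unsigned long busy_s,
                              unsigned long idle_mw, unsigned long idle_s)
{
    return energy_mj(busy_mw, busy_s) + energy_mj(idle_mw, idle_s);
}

/* Run the same job throttled, so the CPU stays busy for the whole window. */
unsigned long throttled_mj(unsigned long throttled_mw, unsigned long window_s)
{
    return energy_mj(throttled_mw, window_s);
}
```

With invented figures of 20 W busy for half an hour plus 2 W idle for the other half, against 14 W throttled for the full hour, the throttled run draws less power at any instant while busy but spends more energy on the same job (50.4 kJ against 39.6 kJ). Different figures could tip the comparison the other way, which is why the two positions are not directly contradictory.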
Re: [PATCH 3/3] Makes lguest's irq handler typesafe
Hello, Rusty Russell wrote: > There are three possibilities: (1) force everyone to use void *, (2) > force > everyone to be type-correct, (3) allow both with some tricks. Currently > we're on (1). For kthread, with only dozens of users, I chose (2) (very > simple, easy to understand). I think for widespread things like timer and > interrupt handlers, I think (3) is the right way to go. Yeah, during transition, we definitely want (3). > I wanted to get this patch out there and see what the reaction was. I > can > do timers next, if that's going to add fuel to the discussion. I think you successfully got a very small sample of possible reactions. Jeff vetoing it (and for good reasons) and me a bit more positive but not quite sold. Yeah, I think we need a good flame war to determine our heading and converting timer shouldn't take too much of your time, right? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] x86: Change size of node ids from u8 to u16 fixup
On Jan 18, 2008 10:30 AM, <[EMAIL PROTECTED]> wrote: > Change the size of node ids for X86_64 from 8 bits to 16 bits > to accomodate more than 256 nodes. > > Introduce a "numanode_t" type for x86-generic usage. > > Cc: Eric Dumazet <[EMAIL PROTECTED]> > Signed-off-by: Mike Travis <[EMAIL PROTECTED]> > Reviewed-by: Christoph Lameter <[EMAIL PROTECTED]> > --- > Fixup: > > Size of memnode.embedded_map needs to be changed to > accomodate 16-bit node ids as suggested by Eric. > > V2->V3: > - changed memnode.embedded_map from [64-16] to [64-8] > (and size comment to 128 bytes) > > V1->V2: > - changed pxm_to_node_map to u16 > - changed memnode map entries to u16 > --- > arch/x86/mm/numa_64.c |2 +- > drivers/acpi/numa.c |2 +- > include/asm-x86/mmzone_64.h |6 +++--- > include/linux/numa.h|6 ++ > 4 files changed, 11 insertions(+), 5 deletions(-) > > --- a/arch/x86/mm/numa_64.c > +++ b/arch/x86/mm/numa_64.c > @@ -88,7 +88,7 @@ static int __init allocate_cachealigned_ > unsigned long pad, pad_addr; > > memnodemap = memnode.embedded_map; > - if (memnodemapsize <= 48) > + if (memnodemapsize <= ARRAY_SIZE(memnode.embedded_map)) > return 0; > > pad = L1_CACHE_BYTES - 1; > --- a/drivers/acpi/numa.c > +++ b/drivers/acpi/numa.c > @@ -38,7 +38,7 @@ ACPI_MODULE_NAME("numa"); > static nodemask_t nodes_found_map = NODE_MASK_NONE; > > /* maps to convert between proximity domain and logical node ID */ > -static int pxm_to_node_map[MAX_PXM_DOMAINS] > +static numanode_t pxm_to_node_map[MAX_PXM_DOMAINS] > = { [0 ... MAX_PXM_DOMAINS - 1] = NID_INVAL }; > static int node_to_pxm_map[MAX_NUMNODES] > = { [0 ... MAX_NUMNODES - 1] = PXM_INVAL }; ...> > #define MAX_NUMNODES(1 << NODES_SHIFT) > > +#if MAX_NUMNODES > 256 > +typedef u16 numanode_t; > +#else > +typedef u8 numanode_t; > +#endif > + > #endif /* _LINUX_NUMA_H */ that is wrong, you can not change pxm_to_node_map from int to u8 or u16. 
int acpi_map_pxm_to_node(int pxm) { int node = pxm_to_node_map[pxm]; if (node < 0){ if (nodes_weight(nodes_found_map) >= MAX_NUMNODES) return NID_INVAL; node = first_unset_node(nodes_found_map); __acpi_map_pxm_to_node(pxm, node); node_set(node, nodes_found_map); } return node; } With an unsigned map, node for an unassigned entry will always be 255 or 65535 and never negative, so the node < 0 check never fires. Please keep that as int. I got SART: PXM 0 -> APIC 0 -> Node 255 SART: PXM 0 -> APIC 1 -> Node 255 SART: PXM 1 -> APIC 2 -> Node 255 SART: PXM 1 -> APIC 3 -> Node 255 YH
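The failure mode described above can be reproduced in userspace. This is only a sketch: it assumes NID_INVAL is -1 (the thread does not show its definition), uses uint8_t and uint16_t as stand-ins for the kernel's u8 and u16, and the helper names are invented for the demonstration.

```c
#include <stdint.h>

#define NID_INVAL (-1)  /* assumed value of the "no node yet" sentinel */

/* Map element stored as int: the sentinel stays negative. */
int unassigned_seen_with_int(void)
{
    int map_entry = NID_INVAL;
    int node = map_entry;
    return node < 0;            /* the test used in acpi_map_pxm_to_node() */
}

/* Map element stored as u8: -1 wraps to 255 and is promoted to int 255. */
int unassigned_seen_with_u8(void)
{
    uint8_t map_entry = (uint8_t)NID_INVAL;
    int node = map_entry;
    return node < 0;
}

/* What actually ends up in the table for each narrow type. */
int stored_sentinel_u8(void)
{
    uint8_t map_entry = (uint8_t)NID_INVAL;
    return map_entry;
}

int stored_sentinel_u16(void)
{
    uint16_t map_entry = (uint16_t)NID_INVAL;
    return map_entry;
}
```

Under those assumptions the int version still detects an unassigned entry, while the narrow unsigned versions hand back 255 or 65535 as if they were valid node ids, which matches the "Node 255" lines quoted above.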
Re: [PATCH 3/3] Makes lguest's irq handler typesafe
On Saturday 19 January 2008 12:44:52 Tejun Heo wrote: > Tejun Heo wrote: > > so I think the question is "do we want to change all callbacks to > > take native pointer type instead of void pointer?". > > Lemme clarity myself a bit. I'm not saying that we should convert all > at once or literally every callback should be converted. What I'm > saying is whether we're headed that way in general and converting big > ones - timer for example - and getting the conversion agreed upon should > be enough to set the norm. Hi Tejun There are three possibilities: (1) force everyone to use void *, (2) force everyone to be type-correct, (3) allow both with some tricks. Currently we're on (1). For kthread, with only dozens of users, I chose (2) (very simple, easy to understand). I think for widespread things like timer and interrupt handlers, I think (3) is the right way to go. I wanted to get this patch out there and see what the reaction was. I can do timers next, if that's going to add fuel to the discussion. Thanks! Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
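For readers following along, here is one way to get a type-correct handler on top of a void * callback slot in plain C. To be clear, this is not the check_either_type() macro from the patch (its definition is not shown in this thread); DEFINE_TYPED_CALLBACK, struct callback and struct counter are hypothetical names made up for this sketch.

```c
/* Option (1): the generic slot stores a void * argument. */
struct callback {
    void (*fn)(void *data);
    void *data;
};

void callback_fire(struct callback *cb)
{
    cb->fn(cb->data);
}

/*
 * Hypothetical helper: generates a void * thunk plus a typed setter, so
 * the handler and the pointer registered with it must agree on `type`.
 * The only cast lives in the generated thunk.
 */
#define DEFINE_TYPED_CALLBACK(name, type, handler)              \
    static void name##_thunk(void *data)                        \
    {                                                           \
        handler((type *)data);                                  \
    }                                                           \
    static void name##_set(struct callback *cb, type *arg)      \
    {                                                           \
        cb->fn = name##_thunk;                                  \
        cb->data = arg;                                         \
    }

struct counter {
    int hits;
};

/* A type-correct handler: no cast from void * inside the body. */
static void counter_hit(struct counter *c)
{
    c->hits++;
}

DEFINE_TYPED_CALLBACK(counter_cb, struct counter, counter_hit)
```

Passing anything other than a struct counter * to counter_cb_set() is diagnosed by the compiler as an incompatible pointer type, while existing void * users of struct callback keep working. That is roughly the "allow both" flavour of (3), at the cost of one generated thunk per handler.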
Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles
On Fri, 18 Jan 2008, Frank Ch. Eigler wrote: > > All this complexity is to be justified by keeping the raw prev/next > pointers from being sent to a naive tracer? It seems to me way out of > proportion. Damn, and I just blew away all my marker code for something like this ;-) Actually, you just gave me a great idea that I think can help all of us. OK, Mathieu may not be in total agreement, but I think this is the ultimate compromise. We have in sched.c the following marker: trace_mark(kernel_sched_scheduler, "prev %p next %p", prev, next); Then Mathieu can add in some code somewhere (or a module, or something) ret = marker_probe_register("kernel_sched_scheduler", "prev %p next %p", pretty_print_sched_switch, NULL); static void pretty_print_sched_switch(const struct marker *mdata, void *private_data, const char *format, ...) { va_list ap; struct task_struct *prev; struct task_struct *next; va_start(ap, format); prev = va_arg(ap, typeof(prev)); next = va_arg(ap, typeof(next)); va_end(ap); trace_mark(kernel_pretty_print_sched_switch, "prev_pid %d next_pid %d prev_state %ld", prev->pid, next->pid, prev->state); } Then LTTng on startup could arm the normal kernel_sched_switch code and have the user see the nice one. All without adding any more goo or overhead to the non-tracing case, and keeping a few critical markers with enough information to be useful to other tracers! Thoughts? -- Steve
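The probe idea above boils down to pulling two raw pointers out of a va_list and forwarding only the fields a simple tracer should see. A userspace sketch of just that step, with no marker infrastructure: struct task stands in for task_struct, and the last_* variables stand in for the second trace_mark() call. Both are assumptions made for the demonstration.

```c
#include <stdarg.h>

/* Stand-in for the few task_struct fields the probe forwards. */
struct task {
    int pid;
    long state;
};

/* Stand-ins for the "pretty" trace point: remember what was forwarded. */
static int last_prev_pid;
static int last_next_pid;
static long last_prev_state;

/* Receives the raw "prev %p next %p" arguments as varargs. */
void pretty_print_sched_switch(const char *format, ...)
{
    va_list ap;
    struct task *prev;
    struct task *next;

    va_start(ap, format);
    prev = va_arg(ap, struct task *);
    next = va_arg(ap, struct task *);
    va_end(ap);                     /* va_end takes the list as argument */

    /* Forward only the digested values, never the raw pointers. */
    last_prev_pid = prev->pid;
    last_next_pid = next->pid;
    last_prev_state = prev->state;
}
```

The caller must pass exactly the argument types the probe extracts. Nothing checks that at compile time, which is the usual price of a format-string based interface.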
Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles
Hi - On Fri, Jan 18, 2008 at 06:19:29PM -0500, Mathieu Desnoyers wrote: > [...] > Almost.. I would add : > > static int trace_switch_to_enabled; > > > static inline trace_switch_to(struct task_struct *prev, > > struct task_struct *next) > > { > if (likely(!trace_switch_to_enabled)) > return; > > trace_mark(kernel_schedudule, > > "prev_pid %d next_pid %d prev_state %ld", > > prev->pid, next->pid, prev->pid); > > > > trace_context_switch(prev, next); > > } > > And some code to activate the trace_switch_to_enabled variable (ideally > keeping a refcount). [...] All this complexity is to be justified by keeping the raw prev/next pointers from being sent to a naive tracer? It seems to me way out of proportion. - FChE -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
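The quoted suggestion (an enabled flag, "ideally keeping a refcount") could look roughly like the following in userspace C11. It is a sketch only: the get/put helper names and the events_recorded counter are invented here, and kernel code would use its own atomic primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>

static atomic_int trace_switch_to_users;  /* how many tracers are attached */
static int events_recorded;               /* stand-in for real tracing work */

/* A tracer attaches: the hook becomes live while the count is non-zero. */
void trace_switch_to_get(void)
{
    atomic_fetch_add(&trace_switch_to_users, 1);
}

/* A tracer detaches: the hook goes dormant again when the count hits zero. */
void trace_switch_to_put(void)
{
    atomic_fetch_sub(&trace_switch_to_users, 1);
}

void trace_switch_to(int prev_pid, int next_pid)
{
    /* Common case: nobody is listening, so bail out after one load. */
    if (atomic_load(&trace_switch_to_users) == 0)
        return;

    (void)prev_pid;
    (void)next_pid;
    events_recorded++;  /* not atomic; fine for a single-threaded demo */
}
```

A counter rather than a plain flag lets two independent tracers attach and detach without one switching the hook off underneath the other.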
Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles
Hi - On Fri, Jan 18, 2008 at 05:49:19PM -0500, Steven Rostedt wrote: > [...] > > But I have not seen a lot of situations where that kind of glue-code was > > needed, so I think it makes sense to keep markers simple to use and > > efficient for the common case. > > > > Then, in this glue-code, we can put trace_mark() and calls to in-kernel > > tracers. > > I'm almost done with the latency tracer work, and there are only a total > of 6 hooks that I needed. > [...] > With the above, we could have this (if this is what I think you are > recommending). [...] > static inline trace_switch_to(struct task_struct *prev, > struct task_struct *next) > { > trace_mark(kernel_schedudule, > "prev_pid %d next_pid %d prev_state %ld", > prev->pid, next->pid, prev->pid); > > trace_context_switch(prev, next); > } I'm afraid I don't see the point in this. You could use one marker for all that data (and force the more naive tracer callbacks to ignore some of it). You could even use two markers (and force the more naive tracer to attach only to its favorite subset). But to use a second, different, less efficient, not more configurable tracing hook mechanism in the same logical spot makes no sense to me. - FChE
Re: [PATCH] X86: fix typo PAT to X86_PAT
On Fri, Jan 18, 2008 at 10:02:10PM +0100, Ingo Molnar wrote: > > * Dave Jones <[EMAIL PROTECTED]> wrote: > > > > you mean modifies MTRRs? Which code is that? (besides the > > > /proc/mtrr userspace API) > > > > This exclusion is going to be a real pain in the ass for distro > > kernels. It's impossible for example to build a kernel that will now > > support the MTRR-alike registers on the AMD K6/early Cyrix etc and > > also support PAT. > > > > Additionally, given people tend to update their kernels a lot more > > often than they update to a whole new version of X, it means until > > userspace has caught up, we can't ship a kernel with PAT supported, or > > else X gets a lot slower due to the missing mtrr support. > > there's no exclusion enforced right now, and if a CPU is PAT-incapable > (or if the kernel is booted nopat) then the MTRR bits should be usable. > But if we boot with PAT enabled, and Xorg gets /proc/mtrr wrong, we'll > see nasty crashes. If it gets them right, it should all still work just > fine. Is this ok? Then, in a year or two, distros can disable write > support to /proc/mtrr. Hm? A crazy idea just occurred to me... We could make /proc/mtrr an interface to set PAT on a range of memory. This would make it transparently work without any changes in X or anything else that sets them in userspace. Dave -- http://www.codemonkey.org.uk
Re: [BUG] 2.6.23-rc9 kernel panic - simple_map_write+0x4e/0x75
looks very similar to http://marc.info/?l=linux-kernel&m=119759817332220&w=2 http://marc.info/?l=linux-kernel&m=119902059626408&w=2 http://marc.info/?l=linux-kernel&m=119259674826979&w=2 http://lkml.org/lkml/2006/6/14/59 so you should try without CONFIG_MTD_PNC2000 that driver having problems for some time (first report seems 2.6.17 ) - i have found same issue (independently) on 2.6.22 and 2.6.24, too. i have some updated information regarding this driver i will post here very soon, but please confirm if this is the issue here (i`m quite sure it is) regards roland Subject:Re: [BUG] 2.6.23-rc9 kernel panic - simple_map_write+0x4e/0x75 From: Kamalesh Babulal Date: 2007-10-17 7:23:33 Message-ID: 4715B5A5.9050005 () linux ! vnet ! ibm ! com [Download message RAW] Andrew Morton wrote: > On Sat, 13 Oct 2007 12:10:44 +0530 > Kamalesh Babulal <[EMAIL PROTECTED]> wrote: > >> Kernel panic's with following oops message with 2.6.23-rc9 kernel >> >> [ 320.747257] ks0108: ERROR: parport didn't register new device >> [ 320.771314] cfag12864b: ERROR: ks0108 is not initialized >> [ 320.794308] cfag12864bfb: ERROR: cfag12864b is not initialized >> [ 320.820729] BUG: unable to handle kernel paging request at virtual address >> bf00 >> [ 320.857712] printing eip: >> [ 320.872556] *pde = >> [ 320.887577] Oops: 0002 [#1] >> [ 320.902383] SMP >> [ 320.914174] Modules linked in: >> [ 320.929333] CPU:0 >> [ 320.929334] EIP:0060:[]Not tainted VLI >> [ 320.929335] EFLAGS: 00010286 (2.6.23-rc9-1 #1) >> [ 320.982753] EIP is at simple_map_write+0x4e/0x75 >> [ 321.001956] eax: f0f0f0f0 ebx: c1de3f00 ecx: c1de3f00 edx: c1de3f00 >> [ 321.027701] esi: c3ca8d6c edi: bf00 ebp: c3ca8d98 esp: c3ca8d6c >> [ 321.053322] ds: 007b es: 007b fs: 00d8 gs: ss: 0068 >> [ 321.075981] Process swapper (pid: 1, ti=c3ca8000 task=f7f44000 >> task.ti=c3ca8000) >> [ 321.103031] Stack: f0f0f0f0 >> >> [ 321.139446]c3ca8e20 0001 c3ca8e40 c3ca8e6c c0d692e6 f0f0f0f0 >> >> [ 321.176495] >> 50e6 >> [ 321.214141] Call Trace: 
>> [ 321.233922] [] show_trace_log_lvl+0x19/0x2e >> [ 321.255433] [] show_stack_log_lvl+0x99/0xa1 >> [ 321.276706] [] show_registers+0x1b8/0x290 >> [ 321.297254] [] die+0x118/0x1fd >> [ 321.314920] [] do_page_fault+0x51c/0x5f3 >> [ 321.335291] [] error_code+0x72/0x78 >> [ 321.354413] [] cfi_probe_chip+0x148/0x9e1 >> [ 321.375202] [] genprobe_new_chip+0x82/0x98 >> [ 321.396298] [] genprobe_ident_chips+0x26/0x205 >> [ 321.418493] [] mtd_do_chip_probe+0x10/0x97 >> [ 321.439654] [] cfi_probe+0xd/0xf >> [ 321.458157] [] do_map_probe+0x40/0x53 >> [ 321.477931] [] init_pnc2000+0x3b/0x6d >> [ 321.497559] [] do_initcalls+0x7a/0x1c2 >> [ 321.517377] [] do_basic_setup+0x1c/0x1e >> [ 321.537327] [] kernel_init+0x69/0xaa >> [ 321.556311] [] kernel_thread_helper+0x7/0x10 >> [ 321.577207] === >> [ 321.592882] Code: 83 f8 01 75 0a 03 7b 10 8b 45 d4 88 07 eb 35 83 f8 02 >> 75 0c >> 0f b7 45 d4 03 7b 10 66 89 07 eb 24 83 f8 04 75 0a 03 7b 10 8b 45 d4 <89> 07 >> eb >> 15 7e 13 03 7b 10 89 c1 c1 e9 02 f3 a5 89 c1 83 e1 03 >> [ 321.668990] EIP: [] simple_map_write+0x4e/0x75 SS:ESP >> 0068:c3ca8d6c >> [ 321.695750] Kernel panic - not syncing: Attempted to kill init! > > Would I be correct in assuming that the machine has no mtd devices, but > you happened to link that driver into your vmlinux? > Hi Andrew, The machine does not have the mtd device, and the mtd driver is compiled into the vmlinuz. This configuration works fine for other kernels; the panic is reproducible with 2.6.23-rc9 only. -- Thanks & Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL.
Re: [patch] Converting writeback linked lists to a tree based data structure
On Fri, Jan 18, 2008 at 01:41:33PM +0800, Fengguang Wu wrote: > > That is, think of large file writes like process scheduler batch > > jobs - bulk throughput is what matters, so the larger the time slice > > you give them the higher the throughput. > > > > IMO, the sort of result we should be looking at is a > > writeback design that results in cycling somewhat like: > > > > slice 1: iterate over small files > > slice 2: flush large file 1 > > slice 3: iterate over small files > > slice 4: flush large file 2 > > .. > > slice n-1: flush large file N > > slice n: iterate over small files > > slice n+1: flush large file N+1 > > > > So that we keep the disk busy with a relatively fair mix of > > small and large I/Os while both are necessary. > > If we can sync fast enough, the lower layer would be able to merge > those 4MB requests. No, not necessarily - think of a stripe with a chunk size of 512k. That 4MB will be split into 8x512k chunks and sent to different devices (and hence elevator queues). The only way you get elevator merging in this sort of config is if you send multiple stripe *width* sized amounts to the device in a very short time period. I see quite a few filesystems with stripe widths in the tens of MB range. > > Put simply: > > > > The higher the bandwidth of the device, the more frequently > > we need to be servicing the inodes with large amounts of > > dirty data to be written to maintain write throughput at a > > significant percentage of the device capability. > > > > The writeback algorithm needs to take this into account for it > > to be able to scale effectively for high throughput devices. > > Slow queues go full first. Currently the writeback code will skip > _and_ congestion_wait() for congested filesystems. The better policy > is to congestion_wait() _after_ all other writable pages have been > synced. Agreed. The comments I've made are mainly concerned with getting efficient flushing of a single device occurring.
Interactions between multiple devices are a separable issue. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
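The stripe arithmetic in this message can be spelled out. The sketch below assumes requests start on a chunk boundary and that chunks are dealt round-robin across the member devices. The device count is a free parameter because the message does not state one, and both function names are made up for illustration.

```c
/* Number of chunk-sized pieces one request is split into. */
unsigned long chunks_in_request(unsigned long request_bytes,
                                unsigned long chunk_bytes)
{
    return (request_bytes + chunk_bytes - 1) / chunk_bytes;
}

/*
 * How many of those chunks land on device `dev` (0-based) when they are
 * dealt round-robin over `ndevices`, starting at device 0.
 */
unsigned long chunks_on_device(unsigned long nchunks,
                               unsigned long ndevices,
                               unsigned long dev)
{
    return nchunks / ndevices + (dev < nchunks % ndevices ? 1 : 0);
}
```

For a 4MB request and a 512k chunk this gives 8 chunks. On a hypothetical 8-device stripe each queue then sees a single chunk, so there is nothing adjacent to merge. Only when several stripe widths arrive close together does each device get a run of chunks its elevator could combine.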
Re: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
> > I think it should be enabled on AMD too though. If the reordering breaks > > it then blacklisting won't help anyways. Actually it is already enabled on AMD. You check for is_cpu(INTEL) but that just checks the generic MTRR architecture and all AMD CPUs since K7 use that one too. That is ok imho. Perhaps it would be good to fix the incorrect comment though. > > > > -Andi > > > > [1] but I checked the known errata and there was nothing related to MTRR. > > Ah, ok, that explains your reticence earlier. Thanks for testing again, I > guess the patch is good to go. I see a failure here now on a (AMD) system where it trims a lot of memory, but should probably not (or at least i haven't noticed any malfunction before without it). Investigating. -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
something odd in emu10k1/emufx
In copy_tlv() we have tlv = kmalloc(data[1] * 4 + sizeof(data), GFP_KERNEL); if (!tlv) return NULL; memcpy(tlv, data, sizeof(data)); if (copy_from_user(tlv + 2, _tlv + 2, data[1])) { kfree(tlv); return NULL; } which looks rather odd, since either we kmalloc too much or copy too little... Comments? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
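To make the mismatch concrete, the two sizes can be computed side by side. This sketch assumes data is a two-element array of unsigned int (so sizeof(data) in the quoted code is the size of two ints) and that tlv points to unsigned int, so tlv + 2 skips exactly that header. Neither declaration is shown in the message, and the function names are invented. It takes no position on which of the two sizes is the intended one.

```c
#include <stddef.h>

/* Bytes requested by the quoted kmalloc(): data[1] * 4 + sizeof(data). */
size_t tlv_alloc_bytes(const unsigned int data[2])
{
    return (size_t)data[1] * 4 + 2 * sizeof(unsigned int);
}

/*
 * Bytes actually written: the two-int header via memcpy(), then data[1]
 * bytes via copy_from_user().
 */
size_t tlv_filled_bytes(const unsigned int data[2])
{
    return 2 * sizeof(unsigned int) + (size_t)data[1];
}
```

Whenever data[1] is non-zero, the allocation is larger than what gets filled by three times data[1] bytes, which is the "kmalloc too much or copy too little" observation in numbers.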
[PATCH 6/6] export __supported_pte_mask
export __supported_pte_mask variable as GPL symbol. lguest is a user of it. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- arch/x86/kernel/setup64.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/setup64.c b/arch/x86/kernel/setup64.c index 8fa0de8..5cc1339 100644 --- a/arch/x86/kernel/setup64.c +++ b/arch/x86/kernel/setup64.c @@ -41,6 +41,8 @@ struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table }; char boot_cpu_stack[IRQSTACKSIZE] __attribute__((section(".bss.page_aligned"))); unsigned long __supported_pte_mask __read_mostly = ~0UL; +EXPORT_SYMBOL_GPL(__supported_pte_mask); + static int do_not_nx __cpuinitdata = 0; /* noexec=on|off -- 1.5.0.6 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 5/6] export check_tsc_unstable
Export check_tsc_unstable function as GPL symbol. lguest is a user of it. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- arch/x86/kernel/tsc_64.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c index c62f3b6..947554d 100644 --- a/arch/x86/kernel/tsc_64.c +++ b/arch/x86/kernel/tsc_64.c @@ -92,10 +92,12 @@ sched_clock(void) __attribute__((alias("native_sched_clock"))); static int tsc_unstable; -inline int check_tsc_unstable(void) +int check_tsc_unstable(void) { return tsc_unstable; } +EXPORT_SYMBOL_GPL(check_tsc_unstable); + #ifdef CONFIG_CPU_FREQ /* Frequency scaling support. Adjust the TSC based timer when the cpu frequency -- 1.5.0.6
[PATCH 4/6] use __PAGE_KERNEL instead of _PAGE_KERNEL
x86_64 doesn't expose the intermediate representation with one underscore, _PAGE_KERNEL, just the double-underscored one. Use that instead, to get common ground between 32-bit and 64-bit. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- drivers/lguest/page_tables.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/lguest/page_tables.c b/drivers/lguest/page_tables.c index 399c05d..fb5ebd0 100644 --- a/drivers/lguest/page_tables.c +++ b/drivers/lguest/page_tables.c @@ -645,7 +645,7 @@ void map_switcher_in_guest(struct lg_cpu *cpu, struct lguest_pages *pages) /* Make the last PGD entry for this Guest point to the Switcher's PTE * page for this CPU (with appropriate flags). */ - switcher_pgd = __pgd(__pa(switcher_pte_page) | _PAGE_KERNEL); + switcher_pgd = __pgd(__pa(switcher_pte_page) | __PAGE_KERNEL); cpu->lg->pgdirs[cpu->cpu_pgd].pgdir[SWITCHER_PGD_INDEX] = switcher_pgd; @@ -657,7 +657,7 @@ void map_switcher_in_guest(struct lg_cpu *cpu, struct lguest_pages *pages) * page is already mapped there, we don't have to copy them out * again. */ pfn = __pa(cpu->regs_page) >> PAGE_SHIFT; - regs_pte = pfn_pte(pfn, __pgprot(_PAGE_KERNEL)); + regs_pte = pfn_pte(pfn, __pgprot(__PAGE_KERNEL)); switcher_pte_page[(unsigned long)pages/PAGE_SIZE%PTRS_PER_PTE] = regs_pte; } /*:*/ -- 1.5.0.6
[PATCH 3/6] explicitly use sched.h include
This patch adds the sched.h header explicitly to the lguest_user file, to avoid depending on it being included somewhere else. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- drivers/lguest/lguest_user.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/lguest/lguest_user.c b/drivers/lguest/lguest_user.c index a87fca6..85d42d3 100644 --- a/drivers/lguest/lguest_user.c +++ b/drivers/lguest/lguest_user.c @@ -6,6 +6,7 @@ #include #include #include +#include #include "lg.h" /*L:055 When something happens, the Waker process needs a way to stop the -- 1.5.0.6
[PATCH 2/6] explicitly use hrtimer.h include
This patch adds the hrtimer.h header explicitly to the lg.h file, to avoid depending on it being included somewhere else. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- drivers/lguest/lg.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h index f9707cf..eb51fc2 100644 --- a/drivers/lguest/lg.h +++ b/drivers/lguest/lg.h @@ -8,6 +8,7 @@ #include #include #include +#include #include #include -- 1.5.0.6
[PATCH 1/6] explicitly use ktime.h include
This patch adds the ktime.h header explicitly to the hypercalls file, to avoid depending on it being included somewhere else. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- drivers/lguest/hypercalls.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/lguest/hypercalls.c b/drivers/lguest/hypercalls.c index 32666d0..0f2cb4f 100644 --- a/drivers/lguest/hypercalls.c +++ b/drivers/lguest/hypercalls.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include "lg.h" -- 1.5.0.6
[PATCH 0/6] lguest patches for compiling x86_64
Right now, I have the in-tree lguest module compiling on x86_64. It's not yet in a sendable state, since the module itself isn't loading. However, this subset of the series is pretty straightforward, and I'm sending it now, aiming to reduce the delta size in the future ;-) Have fun,
Re: [PATCH 3/3] Makes lguest's irq handler typesafe
Tejun Heo wrote: > so I think the question is "do we want to change all callbacks to > take native pointer type instead of void pointer?". Lemme clarify myself a bit. I'm not saying that we should convert all at once or that literally every callback should be converted. What I'm asking is whether we're headed that way in general; converting the big ones - timer for example - and getting the conversion agreed upon should be enough to set the norm. Thanks. -- tejun
Re: [PATCH 3/3] Makes lguest's irq handler typesafe
Hello, Rusty. Rusty Russell wrote: > On Saturday 19 January 2008 10:12:33 Tejun Heo wrote: >> Type safety is good but I doubt this would be worth the complexity. It >> has some benefits but there's much larger benefit in keeping things in >> straight C. People know that functions take fixed types and are also >> familiar with the convention of passing void * for callback arguments. >> IMHO, staying in line with that common knowledge easily trumps having >> type checking on interrupt handler. > > I sympathise with this argument, but I think just because people are familiar > with existing hacks shouldn't prevent improvement. I think the resulting > code is clearer and more readable. > > Even in the implementation, the tricky part is the check_either_type() macro: > the rest is straightforward. The change is a small one, and neither the cost nor the benefit is big. >> Also, how often do we see a bug where things go wrong because interrupt >> handler is given the wrong type of argument? Even when such bug >> happens, I doubt it can escape the developer's workstation if he/she is >> paying any attention to testing. > > I agree this one is unlikely. But I am trying to spread type-safety more > widely (see previous kthread patches). > > I like changing the kernel to make life simpler for developers. We don't do > enough of it. I'm in full agreement here, but the cost / benefit equation doesn't seem quite right to me. If we're gonna convert all callbacks to take native pointers, I'm fine with the irq handler part too. If not, it just adds confusion, which is much worse than any benefit it can bring, so I think the question is "do we want to change all callbacks to take native pointer type instead of void pointer?". -- tejun
patch driver-core-constify-the-name-passed-to-platform_device_register_simple.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled Subject: Driver Core: constify the name passed to platform_device_register_simple to my gregkh-2.6 tree. Its filename is driver-core-constify-the-name-passed-to-platform_device_register_simple.patch This tree can be found at http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/ >From [EMAIL PROTECTED] Fri Jan 18 17:28:36 2008 From: Stephen Rothwell <[EMAIL PROTECTED]> Date: Fri, 11 Jan 2008 17:24:53 +1100 Subject: Driver Core: constify the name passed to platform_device_register_simple To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], LKML Message-ID: <[EMAIL PROTECTED]> This name is just passed to platform_device_alloc which has its parameter declared const. Signed-off-by: Stephen Rothwell <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- drivers/base/platform.c |2 +- include/linux/platform_device.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -360,7 +360,7 @@ EXPORT_SYMBOL_GPL(platform_device_unregi * the Linux driver model. In particular, when such drivers are built * as modules, they can't be "hotplugged". 
*/ -struct platform_device *platform_device_register_simple(char *name, int id, +struct platform_device *platform_device_register_simple(const char *name, int id, struct resource *res, unsigned int num) { struct platform_device *pdev; --- a/include/linux/platform_device.h +++ b/include/linux/platform_device.h @@ -35,7 +35,7 @@ extern struct resource *platform_get_res extern int platform_get_irq_byname(struct platform_device *, char *); extern int platform_add_devices(struct platform_device **, int); -extern struct platform_device *platform_device_register_simple(char *, int id, +extern struct platform_device *platform_device_register_simple(const char *, int id, struct resource *, unsigned int); extern struct platform_device *platform_device_alloc(const char *name, int id); Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are bad/battery-class-driver.patch driver/driver-core-constify-the-name-passed-to-platform_device_register_simple.patch -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] Improve type handling in interrupt handlers
On Saturday 19 January 2008 07:41:41 Jeff Garzik wrote: > FWIW, I have been working in this area extensively. Excellent... > Check out the 'irq-cleanups' and 'irq-remove' branches of > git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/misc-2.6.git Your irq-cleanups branch is nice work! But AFAICT these patches are not included in your irq-cleanups branch. Did you want me to switch my patches over to irqreturn_t and send them for you to roll in? Cheers, Rusty.
Re: [PATCH 3/4] introduce __devinitconst
On Fri, Jan 11, 2008 at 01:57:27AM -0700, Jan Beulich wrote: > The drivers picked just serve as examples (which I routinely build and > hence am able to easily verify), i.e. as before the patch doesn't change > all instances where 'const' could have been added as a result of the > base change, only where the change has a real effect (the module loader > doesn't enforce read-only section attributes at present, so only > built-in files make a real difference). What does this buy us? > --- 2.6.24-rc7-initconst.orig/include/linux/init.h > +++ 2.6.24-rc7-initconst/include/linux/init.h > @@ -257,11 +257,13 @@ void __init parse_early_param(void); > #ifdef CONFIG_HOTPLUG > #define __devinit > #define __devinitdata > +#define __devinitconst const > #define __devexit > #define __devexitdata > #else > #define __devinit __init > #define __devinitdata __initdata > +#define __devinitconst __initdata Shouldn't that be "__initdata const" or something like that? thanks, greg k-h
Re: [PATCH 1/3] constify struct attribute_group uses
On Fri, Jan 11, 2008 at 08:37:55AM +, Jan Beulich wrote: > .. as all consumers of it don't require it to be modifiable. > > Unfortunately, due to the two-level constifications, this required > touching quite many files, not all of which I am able to test - please > bear with eventual mistakes or oversights. > > The patch doesn't change all instances where 'const' could have been > added as a result of the base structure changes, only where either the > change has a real effect (the module loader doesn't enforce read-only > section attributes at present, so only built-in files matter) or where > compiler warnings would result otherwise. Hm, code in these areas has changed a lot in -mm, can you respin this against that tree to catch all of the different attribute changes that have happened? thanks, greg k-h
Re: [PATCH 3/3] Makes lguest's irq handler typesafe
On Saturday 19 January 2008 10:12:33 Tejun Heo wrote: > Type safety is good but I doubt this would be worth the complexity. It > has some benefits but there's much larger benefit in keeping things in > straight C. People know that functions take fixed types and are also > familiar with the convention of passing void * for callback arguments. > IMHO, staying in line with that common knowledge easily trumps having > type checking on interrupt handler. I sympathise with this argument, but I don't think the fact that people are familiar with existing hacks should prevent improvement. I think the resulting code is clearer and more readable. Even in the implementation, the tricky part is the check_either_type() macro: the rest is straightforward. > Also, how often do we see a bug where things go wrong because interrupt > handler is given the wrong type of argument? Even when such bug > happens, I doubt it can escape the developer's workstation if he/she is > paying any attention to testing. I agree this one is unlikely. But I am trying to spread type-safety more widely (see previous kthread patches). I like changing the kernel to make life simpler for developers. We don't do enough of it. Cheers, Rusty.
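The kind of compile-time check being debated here can be illustrated outside the kernel. What follows is only a minimal sketch in the spirit of the check_either_type() macro mentioned above, not the macro from the patch: every name in it (HANDLER_TAKES, HANDLER_OK, CHECK_HANDLER, struct nic, struct disk and the three handlers) is hypothetical, and it relies on the GCC/Clang extensions __typeof__ and __builtin_types_compatible_p plus C11 static_assert.

```c
#include <assert.h>

/* Hypothetical device types, used only for this illustration. */
struct nic  { int irqs; };
struct disk { int irqs; };

/* Typed handlers: the second parameter is the real pointer type. */
static int nic_irq(int irq, struct nic *n)   { n->irqs++; return irq; }
static int disk_irq(int irq, struct disk *d) { d->irqs++; return irq; }

/* Legacy handler in the conventional void * style. */
static int legacy_irq(int irq, void *data)   { (void)data; return irq; }

/* 1 if fn is exactly "int (*)(int, argtype)", else 0 (compiler builtin). */
#define HANDLER_TAKES(fn, argtype) \
	__builtin_types_compatible_p(__typeof__(&*(fn)), int (*)(int, argtype))

/*
 * Accept either a handler typed for this data pointer or an old-style
 * void * handler; anything else evaluates to 0.
 */
#define HANDLER_OK(fn, data) \
	(HANDLER_TAKES(fn, __typeof__(data)) || HANDLER_TAKES(fn, void *))

/* Turn a mismatch into a build failure instead of a run-time surprise. */
#define CHECK_HANDLER(fn, data) \
	static_assert(HANDLER_OK(fn, data), \
		      "handler does not accept this data pointer")
```

With these definitions, CHECK_HANDLER(nic_irq, &some_nic) compiles, while CHECK_HANDLER(nic_irq, &some_disk) stops the build. The sketch deliberately stops at the check and does not cast the handler to a void * signature, since calling a function through an incompatible pointer type is undefined in ISO C.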
Re: [PATCH 0/10] Tree fixes for PARAVIRT
On Jan 18, 2008 8:02 PM, Ingo Molnar <[EMAIL PROTECTED]> wrote: > > * Zachary Amsden <[EMAIL PROTECTED]> wrote: > > > > but in exchange you broke all of 32-bit with CONFIG_PARAVIRT=y. > > > Which means you did not even build-test it on 32-bit, let alone boot > > > test it... > > > > Why are we rushing so much to do 64-bit paravirt that we are breaking > > working configurations? If the development is going to be this > > chaotic, it should be done and tested out of tree until it can > > stabilize. > > what you see is an open feedback cycle conducted on lkml. People send > patches for arch/x86, and we tell them if it breaks something. The bug > was found before i pushed out the x86.git devel tree (and the fix is > below - but this shouldn't matter to you because the bug never hit a > public x86.git tree). > > Ingo > Other than this, it seems to build and boot fine. Do you want me to resend? -- Glauber de Oliveira Costa. "Free as in Freedom" http://glommer.net "The less confident you are, the more serious you have to act."
Known prob: MAX_LOCK_DEPTH too low?
On my x86_64 machine, I got the following messages in the log (kernel 2.6.23.14): Jan 16 04:08:38 Astara kernel: BUG: MAX_LOCK_DEPTH too low! Jan 16 04:08:38 Astara kernel: turning off the locking correctness validator. I have no idea what caused it, as I found the message on my console somewhat after the fact. The system had been up over 24 hours at that point and is still running. It still seems 'fine' (it has been up 3 days now), so you can treat this as a "data point".
Re: [PATCH 1/10] add missing parameter for lookup_address
On Fri, Jan 18, 2008 at 12:26:13PM -0800, Chris Wright wrote: > * Glauber de Oliveira Costa ([EMAIL PROTECTED]) wrote: > > lookup_address() receives two parameters, but efi_64.c call > > is passing only one. It's actually preventing the tree from compiling > > > > Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> > > Good catch, I know I don't test with CONFIG_EFI=y Ah, that probably came from the CPA patchset, which added the parameter. Sorry for that. -Andi
[GEODE] Geode GX/LX watchdog timer (was 2.6.24-rc8 hangs at mfgpt-timer)
On 17/01/08 23:52 +0100, Arnd Hannemann wrote: > >> Watchdog for the new API would be great :-) > > > > Coming soon. As promised, a watchdog driver for the Geode GX/LX processors is attached. I basically just ported the previous patch forward to 2.6.24. I also have good news or bad news depending on your perspective. I wanted to test this against 2.6.24, and OLPC is stuck at an older kernel version, so I had to test this with coreboot (LinuxBIOS) on another Geode platform. Like all BIOSen except for the OLPC firmware, coreboot uses VSA (SMM handler), which consumes all the timers. So I used the magical MSR and surprise! - the timer tick hung. I compiled out the timer tick, and tested the watchdog timer instead, and it worked fine on timer 0. So I don't think the MFGPTs themselves have anything to do with this problem, but I do think it might be related to VSA and possibly interrupts too. I'm going to invoke the strong BIOS fu of our LinuxBIOS / BIOS expert Marc Jones, and see what he comes up with. I don't know how much of a hassle it would be for Andres to get a 2.6.24 kernel running on the OLPC to make sure that this isn't a regression in the timer tick code (I suspect it isn't a regression, but you never know). I also think that it would probably be in our best interest to default CONFIG_GEODE_MFGPT_TIMER to 'n' until we get this figured out. Since most BIOSen don't have timers available, that shouldn't affect too many people. So, anyway, enjoy the watchdog timer - I hope it meets everybody's expectations for the 2.6.25 kernel. Jordan -- Jordan Crouse Systems Software Development Engineer Advanced Micro Devices, Inc. [GEODE] Add a watchdog driver based on the CS5535/CS5536 MFGPT timers From: Jordan Crouse <[EMAIL PROTECTED]> Add a watchdog timer based on the MFGPT timers in the CS5535/CS5536 companion chips to the AMD Geode GX and LX processors. The only caveat is that the BIOS must provide at least one free timer, and most do not.
Signed-off-by: Jordan Crouse <[EMAIL PROTECTED]> --- drivers/watchdog/Kconfig| 13 ++ drivers/watchdog/Makefile |1 drivers/watchdog/geodewdt.c | 321 +++ 3 files changed, 335 insertions(+), 0 deletions(-) Index: git/drivers/watchdog/Kconfig === --- git.orig/drivers/watchdog/Kconfig 2008-01-18 15:06:44.0 -0700 +++ git/drivers/watchdog/Kconfig2008-01-18 17:50:25.0 -0700 @@ -295,6 +295,20 @@ Most people will say N. +config GEODE_WDT + tristate "AMD Geode CS5535/CS5536 Watchdog" + depends on MGEODE_LX + default n + help + This driver enables a watchdog capability built into the +CS5535/CS5536 companion chips for the AMD Geode GX and LX +processors. This watchdog watches your kernel to make sure +it doesn't freeze, and if it does, it reboots your computer after +a certain amount of time. + +You can compile this driver directly into the kernel, or use +it as a module. The module will be called geodewdt. + config SC520_WDT tristate "AMD Elan SC520 processor Watchdog" depends on X86 Index: git/drivers/watchdog/Makefile === --- git.orig/drivers/watchdog/Makefile 2008-01-18 15:06:44.0 -0700 +++ git/drivers/watchdog/Makefile 2008-01-18 16:32:15.0 -0700 @@ -59,6 +59,7 @@ obj-$(CONFIG_ADVANTECH_WDT) += advantechwdt.o obj-$(CONFIG_ALIM1535_WDT) += alim1535_wdt.o obj-$(CONFIG_ALIM7101_WDT) += alim7101_wdt.o +obj-$(CONFIG_GEODE_WDT) += geodewdt.o obj-$(CONFIG_SC520_WDT) += sc520_wdt.o obj-$(CONFIG_EUROTECH_WDT) += eurotechwdt.o obj-$(CONFIG_IB700_WDT) += ib700wdt.o Index: git/drivers/watchdog/geodewdt.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ git/drivers/watchdog/geodewdt.c 2008-01-18 17:47:39.0 -0700 @@ -0,0 +1,308 @@ +/* Watchdog timer for the Geode GX/LX with the CS5535/CS5536 companion chip + * + * Copyright (C) 2006-2007, Advanced Micro Devices, Inc. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#define GEODEWDT_HZ 500 +#define GEODEWDT_SCALE 6 +#define GEODEWDT_MAX_SECONDS 131 + +#define WDT_FLAGS_OPEN 1 +#define WDT_FLAGS_ORPHAN 2 + +#define DRV_NAME "geodewdt" +#define WATCHDOG_NAME "Geode GX/LX WDT" +#define WATCHDOG_TIMEOUT 60 + +static int timeout = WATCHDOG_TIMEOUT; +module_param(timeout, int, 0); +MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds. 1<= timeout <=131, default=" __M
Re: [RFC] Per-thread getrusage
I agree that RUSAGE_THREAD is fine. (In fact, if you'd pressed me to remember without looking, I would have assumed we put it in already.) However, in the implementation, I would keep it cleaner by moving the identical code from inside the loop under case RUSAGE_SELF into a shared subfunction, rather than duplicating it. In fact, here you go (next posting). As to getting arbitrary other threads' data, there are several problems there. Adding a syscall is often more trouble than it's worth. Ulrich cited the issues with that as the API. You also didn't handle compat for it correctly. To warrant the code necessary to make this available by whatever API, I think you need to say some more about what it's needed for. Offhand, it seems most in keeping with other things to expose this via a /proc file, i.e. /proc/tgid/task/tid/rusage (and /proc/tgid/rusage for the RUSAGE_SELF behavior on a foreign process). There we already have the infrastructure for dealing with the security issues uniformly with how we control other similar information. Personally I tend to prefer a binary interface, i.e. a virtual file whose contents are struct rusage; for that you still need to do the extra compat work, since a 32-bit process should have the 32-bit struct rusage layout in its /proc files. If you put the numbers into ascii text as some /proc interfaces do, you don't need any special considerations for CONFIG_COMPAT. Thanks, Roland
[PATCH] RUSAGE_THREAD
This adds the RUSAGE_THREAD option for the getrusage system call. Solaris calls this RUSAGE_LWP and uses the same value (1). That name is not a natural one for Linux, but we keep it as an alias. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- include/linux/resource.h |2 ++ kernel/sys.c | 31 ++- 2 files changed, 24 insertions(+), 9 deletions(-) diff --git a/include/linux/resource.h b/include/linux/resource.h index ae13db7..02b3377 100644 --- a/include/linux/resource.h +++ b/include/linux/resource.h @@ -19,6 +19,8 @@ struct task_struct; #define RUSAGE_SELF 0 #define RUSAGE_CHILDREN (-1) #define RUSAGE_BOTH (-2) /* sys_wait4() uses this */ +#define RUSAGE_THREAD 1 /* only the calling thread */ +#define RUSAGE_LWP RUSAGE_THREAD /* Solaris name for same */ struct rusage { struct timeval ru_utime; /* user time used */ diff --git a/kernel/sys.c b/kernel/sys.c index d1fe71e..6a62bc4 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -1554,6 +1554,19 @@ out: * */ +static void accumulate_thread_rusage(struct task_struct *t, struct rusage *r, +cputime_t *utimep, cputime_t *stimep) +{ + *utimep = cputime_add(*utimep, t->utime); + *stimep = cputime_add(*stimep, t->stime); + r->ru_nvcsw += t->nvcsw; + r->ru_nivcsw += t->nivcsw; + r->ru_minflt += t->min_flt; + r->ru_majflt += t->maj_flt; + r->ru_inblock += task_io_get_inblock(t); + r->ru_oublock += task_io_get_oublock(t); +} + static void k_getrusage(struct task_struct *p, int who, struct rusage *r) { struct task_struct *t; @@ -1563,6 +1576,11 @@ static void k_getrusage(struct task_struct *p, int who, struct rusage *r) memset((char *) r, 0, sizeof *r); utime = stime = cputime_zero; + if (who == RUSAGE_THREAD) { + accumulate_thread_rusage(p, r, &utime, &stime); + goto out; + } + rcu_read_lock(); if (!lock_task_sighand(p, &flags)) { rcu_read_unlock(); @@ -1595,14 +1613,7 @@ static void k_getrusage(struct task_struct *p, int who, struct rusage *r) r->ru_oublock += p->signal->oublock; t = p; do { - utime = cputime_add(utime, t->utime); -
stime = cputime_add(stime, t->stime); - r->ru_nvcsw += t->nvcsw; - r->ru_nivcsw += t->nivcsw; - r->ru_minflt += t->min_flt; - r->ru_majflt += t->maj_flt; - r->ru_inblock += task_io_get_inblock(t); - r->ru_oublock += task_io_get_oublock(t); + accumulate_thread_rusage(t, r, &utime, &stime); t = next_thread(t); } while (t != p); break; @@ -1614,6 +1625,7 @@ static void k_getrusage(struct task_struct *p, int who, struct rusage *r) unlock_task_sighand(p, &flags); rcu_read_unlock(); +out: cputime_to_timeval(utime, &r->ru_utime); cputime_to_timeval(stime, &r->ru_stime); } @@ -1627,7 +1639,8 @@ int getrusage(struct task_struct *p, int who, struct rusage __user *ru) asmlinkage long sys_getrusage(int who, struct rusage __user *ru) { - if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN) + if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN && + who != RUSAGE_THREAD) return -EINVAL; return getrusage(current, who, ru); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
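For illustration, on a kernel that carries this option a userspace caller could query the calling thread's usage roughly as follows. This is a sketch, assuming a libc whose headers expose RUSAGE_THREAD (glibc does so under _GNU_SOURCE); the helper names thread_rusage() and print_thread_rusage() are made up for this example, and the fallback #define simply repeats the value 1 stated in the patch description.

```c
#define _GNU_SOURCE		/* glibc exposes RUSAGE_THREAD only with this */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Fallback for older headers; the patch description gives the value as 1. */
#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1
#endif

/*
 * Fill *ru with the resource usage of the calling thread only.
 * Returns 0 on success and -1 on error (errno is set by getrusage()).
 */
static int thread_rusage(struct rusage *ru)
{
	assert(ru != NULL);
	return getrusage(RUSAGE_THREAD, ru);
}

/* Print CPU times and context-switch counts for the calling thread. */
static int print_thread_rusage(void)
{
	struct rusage ru;

	if (thread_rusage(&ru) != 0) {
		perror("getrusage(RUSAGE_THREAD)");
		return -1;
	}
	printf("user %ld.%06lds  sys %ld.%06lds  vcsw %ld  ivcsw %ld\n",
	       (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec,
	       (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec,
	       ru.ru_nvcsw, ru.ru_nivcsw);
	return 0;
}
```

On a kernel without the option, the call is expected to fail with EINVAL, which matches the check in sys_getrusage() shown in the patch.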
Re: PROBLEM: Celeron Core
On Fri, Jan 18, 2008 at 06:27:57PM -0600, Matt Mackall wrote: > > On Fri, 2008-01-18 at 22:11 +0100, Andi Kleen wrote: > > Chodorenko Michail <[EMAIL PROTECTED]> writes: > > > > > I have a laptop "Extensa 5220", with the processor Celeron based on 'core' > > > technology. > > > There is ~ / arch/i386/kernel/cpu/cpufreq/p4-clockmod.c in the kernel > > > source code > > > but there's no line identification of my CPU for apply frequency change > > > need to add a ID line 0x16 > > > > Note that driver will likely do clock throttling on your CPU. > > Using that is usually a bad idea because it does not actually > > save power. It's only intended to let the CPU cool down in some situations. > > Power consumption is more or less exactly equal to heat production > (that's where the power goes, after all!), so either clock throttling > DOES save power or it DOES NOT cool the CPU. No, actually: on modern x86 CPUs the best strategy for saving power is to do things quickly and then idle longer. That holds on anything that has reasonably deep sleep modes. On older server/desktop systems things might be slightly different because they had very few power saving features enabled, but it's definitely true for all laptop systems from the last several years. But even on desktop/server, throttling tends to be a bad idea. Intel-style throttling makes the CPU skip cycles, so the maximum heat built up per time unit is less, but the CPU stays active for longer, which makes it take more power overall for a given unit of work. Here's a better description from Dominik: http://article.gmane.org/gmane.linux.kernel.cpufreq/3497 Note the conditions he describes are quite common. Also, the OP's CPU likely has C2 and even deeper sleep modes. Another problem with throttling / p4-clockmod is that at least some CPUs (not necessarily P-M, but we saw this on some P4s) can show quite long user-visible latencies.
You might actually get "hanging mouse pointers" from it if you use it with an aggressive governor like ondemand. The normal use case for Intel throttling is to just do an emergency cool down in case the CPU fails (down to thermal shutdown). And that is done transparently behind Linux's back. -Andi
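The two views in this thread can be reconciled with a bit of arithmetic: throttling lowers the average power while the work is running (so the chip runs cooler), yet it can raise the energy spent per unit of work. The sketch below uses a deliberately crude model; the struct, the function names and any wattages plugged into it are assumptions for illustration, not measurements of any CPU.

```c
#include <assert.h>

/* Hypothetical power draw of one CPU in three states, in watts. */
struct cpu_model {
	double p_active;	/* executing instructions */
	double p_stopped;	/* clock gated by throttling, shallow state */
	double p_deep;		/* deep sleep state (C2 or deeper) */
};

/*
 * Energy in joules to finish `busy` seconds of full-speed work inside a
 * `window` of seconds by running flat out and then sleeping deeply.
 */
static double energy_race_to_idle(const struct cpu_model *m,
				  double busy, double window)
{
	assert(busy >= 0.0 && window >= busy);
	return m->p_active * busy + m->p_deep * (window - busy);
}

/*
 * Same work with the clock throttled to `duty` (0 < duty <= 1): the work
 * stretches to busy / duty seconds, and the skipped cycles are spent in
 * the shallow clock-stopped state rather than in deep sleep.
 */
static double energy_throttled(const struct cpu_model *m,
			       double busy, double window, double duty)
{
	double stretched;

	assert(duty > 0.0 && duty <= 1.0);
	stretched = busy / duty;
	assert(busy >= 0.0 && window >= stretched);
	return m->p_active * busy
	     + m->p_stopped * (stretched - busy)
	     + m->p_deep * (window - stretched);
}
```

With the made-up values p_active = 20, p_stopped = 8 and p_deep = 1, one second of work in a four-second window costs 23 J when racing to idle and 30 J at a 50% duty cycle. The difference is (p_stopped - p_deep) * (busy / duty - busy), so in this model throttling only loses when the clock-stopped state draws more than the deep sleep state, which is the situation described above for CPUs with C2 or deeper states. The average power during the throttled run is 14 W instead of 20 W, which is why it still cools the chip.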
Re: [PATCH -v6 2/2] Updating ctime and mtime for memory-mapped files
On Fri, 2008-01-18 at 17:54 -0500, Rik van Riel wrote: > On Fri, 18 Jan 2008 14:47:33 -0800 (PST) > Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > - keep it simple. Let's face it, Linux has never ever given those > > guarantees before, and it's not as if anybody has really cared. Even > > now, the issue seems to be more about paper standards conformance than > > anything else. > > There is one issue which is way more than just standards conformance. > > When a program changes file data through mmap(), at some point the > mtime needs to be updated so that backup programs know to back up the > new version of the file. > > Backup programs not seeing an updated mtime is a really big deal. And that's fixed with the 4-line approach. Reminds me, I've got a patch here for addressing that problem with loop mounts: Writes to loop should update the mtime of the underlying file. Signed-off-by: Matt Mackall <[EMAIL PROTECTED]> Index: l/drivers/block/loop.c === --- l.orig/drivers/block/loop.c 2007-11-05 17:50:07.0 -0600 +++ l/drivers/block/loop.c 2007-11-05 19:03:51.0 -0600 @@ -221,6 +221,7 @@ static int do_lo_send_aops(struct loop_d offset = pos & ((pgoff_t)PAGE_CACHE_SIZE - 1); bv_offs = bvec->bv_offset; len = bvec->bv_len; + file_update_time(file); while (len > 0) { sector_t IV; unsigned size; @@ -299,6 +300,7 @@ static int __do_lo_send_write(struct fil set_fs(get_ds()); bw = file->f_op->write(file, buf, len, &pos); + file_update_time(file); set_fs(old_fs); if (likely(bw == len)) return 0; -- Mathematics is the supreme nostalgia of our time.
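The backup-program concern quoted above can be checked from userspace. The following is a rough sketch, not part of either patch: the helper name mtime_around_mmap_write() is invented here, and whether the second timestamp actually advances depends on the kernel and filesystem under test, which is the very behavior this thread is about.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Dirty the first byte of an existing, non-empty file through a shared
 * mapping, flush it with msync(), and report st_mtime as seen before and
 * after. Returns 0 on success, -1 on any failure.
 */
static int mtime_around_mmap_write(const char *path,
				   time_t *before, time_t *after)
{
	struct stat st;
	char *p;
	int fd;
	int ret = -1;

	assert(path != NULL && before != NULL && after != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) != 0 || st.st_size < 1)
		goto out_close;
	*before = st.st_mtime;

	p = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	p[0] ^= 1;			/* modify file data via the mapping */
	if (msync(p, 1, MS_SYNC) == 0 && fstat(fd, &st) == 0) {
		*after = st.st_mtime;
		ret = 0;
	}
	munmap(p, 1);
out_close:
	close(fd);
	return ret;
}
```

Since st_mtime has one-second granularity, a caller that wants to see a difference would need to sleep for at least a second between creating the file and calling the helper.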
Re: [PATCH] usb-serial: Sierra driver - add devices and update dtr
On Thu, Jan 17, 2008 at 03:15:23PM -0800, Kevin Lloyd wrote: > > > Correct, the 0x0023 is the only newly added device that requires the > new > > > features. > > > > Does that mean things will not work for this device if it is added to > > the device table, without the code updates? > Adding the device will not break the driver (assuming you remove the > tag). Which "tag"? The device id? > > And is this device even public yet? > > No, but we are trying to add native support for devices into kernels > well before they are released in an effort to give better native support > to end-users. Ok, that's great to do, and is what needs to be done; we just can't add new features during the "bug-fix-only" cycle of development :) thanks, greg k-h
patch pm-acquire-device-locks-on-suspend.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled Subject: PM: Acquire device locks on suspend to my gregkh-2.6 tree. Its filename is pm-acquire-device-locks-on-suspend.patch This tree can be found at http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/ >From [EMAIL PROTECTED] Fri Jan 18 16:29:07 2008 From: "Rafael J. Wysocki" <[EMAIL PROTECTED]> Date: Sat, 12 Jan 2008 20:40:46 +0100 Subject: PM: Acquire device locks on suspend To: Greg KH <[EMAIL PROTECTED]> Cc: Alan Stern <[EMAIL PROTECTED]>, Len Brown <[EMAIL PROTECTED]>, Ingo Molnar <[EMAIL PROTECTED]>, ACPI Devel Maling List <[EMAIL PROTECTED]>, pm list <[EMAIL PROTECTED]>, LKML , Johannes Berg <[EMAIL PROTECTED]>, Andrew Morton <[EMAIL PROTECTED]> Message-ID: <[EMAIL PROTECTED]> Content-Disposition: inline From: Rafael J. Wysocki <[EMAIL PROTECTED]> This patch reorganizes the way suspend and resume notifications are sent to drivers. The major changes are that now the PM core acquires every device semaphore before calling the methods, and calls to device_add() during suspends will fail, while calls to device_del() during suspends will block. It also provides a way to safely remove a suspended device with the help of the PM core, by using the device_pm_schedule_removal() callback introduced specifically for this purpose, and updates two drivers (msr and cpuid) that need to use it. Signed-off-by: Alan Stern <[EMAIL PROTECTED]> Signed-off-by: Rafael J. 
Wysocki <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- arch/x86/kernel/cpuid.c|6 arch/x86/kernel/msr.c |6 drivers/base/core.c| 65 + drivers/base/power/main.c | 504 + drivers/base/power/power.h | 12 + include/linux/device.h |8 6 files changed, 414 insertions(+), 187 deletions(-) --- a/arch/x86/kernel/cpuid.c +++ b/arch/x86/kernel/cpuid.c @@ -157,15 +157,15 @@ static int __cpuinit cpuid_class_cpu_cal switch (action) { case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: err = cpuid_device_create(cpu); break; case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: case CPU_DEAD: - case CPU_DEAD_FROZEN: cpuid_device_destroy(cpu); break; + case CPU_UP_CANCELED_FROZEN: + destroy_suspended_device(cpuid_class, MKDEV(CPUID_MAJOR, cpu)); + break; } return err ? NOTIFY_BAD : NOTIFY_OK; } --- a/arch/x86/kernel/msr.c +++ b/arch/x86/kernel/msr.c @@ -155,15 +155,15 @@ static int __cpuinit msr_class_cpu_callb switch (action) { case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: err = msr_device_create(cpu); break; case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: case CPU_DEAD: - case CPU_DEAD_FROZEN: msr_device_destroy(cpu); break; + case CPU_UP_CANCELED_FROZEN: + destroy_suspended_device(msr_class, MKDEV(MSR_MAJOR, cpu)); + break; } return err ? 
NOTIFY_BAD : NOTIFY_OK; } --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -726,11 +726,20 @@ int device_add(struct device *dev) { struct device *parent = NULL; struct class_interface *class_intf; - int error = -EINVAL; + int error; + + error = pm_sleep_lock(); + if (error) { + dev_warn(dev, "Suspicious %s during suspend\n", __FUNCTION__); + dump_stack(); + return error; + } dev = get_device(dev); - if (!dev || !strlen(dev->bus_id)) + if (!dev || !strlen(dev->bus_id)) { + error = -EINVAL; goto Error; + } pr_debug("DEV: registering device: ID = '%s'\n", dev->bus_id); @@ -795,6 +804,7 @@ int device_add(struct device *dev) } Done: put_device(dev); + pm_sleep_unlock(); return error; BusError: device_pm_remove(dev); @@ -905,6 +915,7 @@ void device_del(struct device * dev) struct device * parent = dev->parent; struct class_interface *class_intf; + device_pm_remove(dev); if (parent) klist_del(&dev->knode_parent); if (MAJOR(dev->devt)) @@ -981,7 +992,6 @@ void device_del(struct device * dev) if (dev->bus) blocking_notifier_call_chain(&dev->bus->bus_notifier, BUS_NOTIFY_DEL_DEVICE, dev); - device_pm_remove(dev); kobject_uevent(&dev->kobj, KOBJ_REMOVE); kobject_del(&dev->kobj); if (parent) @@ -1156,14 +1166,11 @@ error: EXPORT_SYMBOL_GPL(device_create); /** - * device_destroy - removes a device that was created with device_create() + * find_device - finds a device that was created with device_create() *
Re: PROBLEM: Celeron Core
On Fri, 2008-01-18 at 22:11 +0100, Andi Kleen wrote: > Chodorenko Michail <[EMAIL PROTECTED]> writes: > > > I have a laptop "Extensa 5220", with the processor Celeron based on 'core' > > technology. > > There is ~ / arch/i386/kernel/cpu/cpufreq/p4-clockmod.c in the kernel > > source code > > but there's no line identification of my CPU for apply frequency change > > need to add a ID line 0x16 > > Note that driver will likely do clock throttling on your CPU. > Using that is usually a bad idea because it does not actually > save power. It's only intended to let the CPU cool down in some situations. Power consumption is more or less exactly equal to heat production (that's where the power goes, after all!), so either clock throttling DOES save power or it DOES NOT cool the CPU. -- Mathematics is the supreme nostalgia of our time.
Re: x86: kdump failure
Oops, I overlooked the use of elf_core_copy_regs in kernel/kexec.c. It is certainly safe and fine to reintroduce the old macro. Everything removed in the "x86 user_regset cleanup" patch is purely removing code and it doesn't hurt to have it back (it's just all unused except for this kexec nit). Unfortunately it really doesn't fit to have kexec call into the new user_regset code that replaced this macro for user core dump purposes. Those new interfaces are really purely for user-mode state, derived only from task_struct (i.e. uses task_pt_regs), not from a struct pt_regs pointer passed in. (There is the minority case where it really is using user-mode state. That part could be done via the user_regset interface, if that saved any trouble.) Things like crash_fixup_ss_esp point to the poor fit of the code intended for user core dumps with what kexec needs. IMHO it would be cleaner for kexec's arch interfaces to fill in elf_gregset_t directly, replacing some of the places a struct pt_regs is passed around now. crash_setup_regs already has to know the name of every register anyway. A particular arch's definition can share code with its core dump or user_regset code when that fits. Thanks, Roland
Re: [git patches] net driver updates for 2.6.25
From: Jeff Garzik <[EMAIL PROTECTED]> Date: Fri, 18 Jan 2008 15:17:21 -0500 > > Please pull from the 'upstream' branch of > git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream > > to receive my 2.6.25 net driver queue into davem/net-2.6.25.git: Pulled and pushed back out, thanks Jeff.
Re: [PATCH] SCSI: fix isa/pcmcia compile problem
James Bottomley wrote: >> I'm just a bit reluctant to touch these drivers, since they're all >> incredibly ancient. We don't have good luck with simple transformation >> patches on the older drivers ... and it seems to take months before >> anyone notices there's a problem. > > This is the patch that will return them to their original behaviour. > > James > > --- > diff --git a/drivers/scsi/pcmcia/Kconfig b/drivers/scsi/pcmcia/Kconfig > index fa481b5..53857c6 100644 > --- a/drivers/scsi/pcmcia/Kconfig > +++ b/drivers/scsi/pcmcia/Kconfig > @@ -6,7 +6,8 @@ menuconfig SCSI_LOWLEVEL_PCMCIA > bool "PCMCIA SCSI adapter support" > depends on SCSI!=n && PCMCIA!=n > > -if SCSI_LOWLEVEL_PCMCIA && SCSI && PCMCIA > +# drivers have problems when build in, so require modules > +if SCSI_LOWLEVEL_PCMCIA && SCSI && PCMCIA && m > > config PCMCIA_AHA152X > tristate "Adaptec AHA152X PCMCIA support" > > Looks good to me. -- tejun
Re: [PATCH] Shrink ext3_inode_info by 8 bytes for !POSIX_ACL.
On Fri, January 18, 2008 20:16, Mingming Cao wrote: > On Sat, 2008-01-12 at 21:35 +0100, Indan Zupancic wrote: >> i_file_acl and i_dir_acl aren't always needed. >> >> With certain configs this makes 10 ext3_inode_cache objects fit in >> one slab instead of the current 9, as the size shrinks from 416 to >> 408 bytes for 32 bit, !POSIX_ACL and !EXT3_FS_XATTR configs. >> >> Signed-off-by: Indan Zupancic <[EMAIL PROTECTED]> >> --- >> fs/ext3/ialloc.c |2 ++ >> fs/ext3/inode.c | 29 +++-- >> include/linux/ext3_fs_i.h |2 ++ >> 3 files changed, 23 insertions(+), 10 deletions(-) >> >> diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c >> index 1bc8cd8..01745bc 100644 >> --- a/fs/ext3/ialloc.c >> +++ b/fs/ext3/ialloc.c >> @@ -574,8 +574,10 @@ got: >> ei->i_frag_no = 0; >> ei->i_frag_size = 0; >> #endif >> +#ifdef CONFIG_EXT3_FS_POSIX_ACL >> ei->i_file_acl = 0; >> ei->i_dir_acl = 0; >> +#endif > > For regular file, i_dir_acl is being reused as i_size_high to support > large file. Only the i_dir_acl of struct ext3_inode, not the one from ext3_inode_info. Thanks, Indan
Re: [PATCH] SCSI: fix isa/pcmcia compile problem
On Fri, 2008-01-18 at 17:32 -0600, James Bottomley wrote: > On Sat, 2008-01-19 at 08:27 +0900, Tejun Heo wrote: > > James Bottomley wrote: > > > On Fri, 2008-01-18 at 16:20 +0900, Tejun Heo wrote: > > >> aha152x.c and fdomain are built twice - once for the isa driver and > > >> once for the PCMCIA one. Through #ifdefs, the compiled codes are > > >> slightly different; thus, global symbols need to be given different > > >> names depending on which flavor is being built. This patch adds > > >> GLOBAL() macro to aha152x.h and fdomain.h which change the symbol > > >> depending on PCMCIA. > > >> > > >> This bug has always existed but has been masked by the fact the > > >> drivers/scsi/pcmcia used subdir-(y|m) instead of obj-(y|m) which made > > >> drivers/scsi/pcmcia/built_in.o not linked into the kernel and thus > > >> avoided the duplicate symbols during compilation. > > >> > > >> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> > > >> --- > > >> Ah... missed that one. Here's the updated version. > > > > > > Actually, isn't the better fix just to return to the original behaviour? > > > > > > As you pointed out, using the subdir instead of obj meant that although > > > the modules were built, the drivers were never linked into the main > > > kernel. According to the records, this has been the default forever, so > > > there can be no-one anywhere relying on these drivers being built in. > > > Actually, as old style pcmcia drivers, I'm not sure there's much value > > > building them into the kernel anyway. > > > > > > So just modify scsi/pcmcia/Kconfig to make them all depend on m. > > > > Yeap, there is no problem if you don't allow them to be linked into the > > kernel. If that's how you want it, please go ahead. 
> > > > I personally think it's a bit odd to disallow building into kernel > > because of the peculiarity of the implementation (including c files and > > compiling them slightly differently) and also no one reporting doesn't > > necessarily mean no one has attempted it and failed. > > Heh ... I'll make you a deal. Find just one user of one of these > drivers who can make use of them built in, and I'll apply the patch. > > I'm just a bit reluctant to touch these drivers, since they're all > incredibly ancient. We don't have good luck with simple transformation > patches on the older drivers ... and it seems to take months before > anyone notices there's a problem. This is the patch that will return them to their original behaviour. James --- diff --git a/drivers/scsi/pcmcia/Kconfig b/drivers/scsi/pcmcia/Kconfig index fa481b5..53857c6 100644 --- a/drivers/scsi/pcmcia/Kconfig +++ b/drivers/scsi/pcmcia/Kconfig @@ -6,7 +6,8 @@ menuconfig SCSI_LOWLEVEL_PCMCIA bool "PCMCIA SCSI adapter support" depends on SCSI!=n && PCMCIA!=n -if SCSI_LOWLEVEL_PCMCIA && SCSI && PCMCIA +# drivers have problems when build in, so require modules +if SCSI_LOWLEVEL_PCMCIA && SCSI && PCMCIA && m config PCMCIA_AHA152X tristate "Adaptec AHA152X PCMCIA support"
Re: [PATCH] SCSI: fix isa/pcmcia compile problem
James Bottomley wrote: >> I personally think it's a bit odd to disallow building into kernel >> because of the peculiarity of the implementation (including c files and >> compiling them slightly differently) and also no one reporting doesn't >> necessarily mean no one has attempted it and failed. > > Heh ... I'll make you a deal. Find just one user of one of these > drivers who can make use of them built in, and I'll apply the patch. I don't think I can. I didn't even know they were isa ones before actually looking at the code. > I'm just a bit reluctant to touch these drivers, since they're all > incredibly ancient. We don't have good luck with simple transformation > patches on the older drivers ... and it seems to take months before > anyone notices there's a problem. Alright then, please go ahead and disallow built-in. Thanks. -- tejun
Re: Why is the kfree() argument const?
"J.A. Magallón" <[EMAIL PROTECTED]> writes: > That's what __attribute__ ((pure)) is for, but if none of the > functions is pure, the compiler can not be sure about side effects > and can not reorder things. Don't forget that functions can do > anything apart from mangling with their arguments. Though it seems it could legally transform: void kfree(const int *x); { int v, *ptr = malloc(sizeof(int)); *ptr = 51; v = *ptr; kfree(ptr); printf("%d", v); into: { int v, *ptr = malloc(sizeof(int)); *ptr = 51; kfree(ptr); v = *ptr; printf("%d", v); } if it knows that malloc generates unaliased pointers, which seems reasonable in general. -- Krzysztof Halasa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SCSI: fix isa/pcmcia compile problem
On Sat, 2008-01-19 at 08:27 +0900, Tejun Heo wrote: > James Bottomley wrote: > > On Fri, 2008-01-18 at 16:20 +0900, Tejun Heo wrote: > >> aha152x.c and fdomain are built twice - once for the isa driver and > >> once for the PCMCIA one. Through #ifdefs, the compiled codes are > >> slightly different; thus, global symbols need to be given different > >> names depending on which flavor is being built. This patch adds > >> GLOBAL() macro to aha152x.h and fdomain.h which change the symbol > >> depending on PCMCIA. > >> > >> This bug has always existed but has been masked by the fact the > >> drivers/scsi/pcmcia used subdir-(y|m) instead of obj-(y|m) which made > >> drivers/scsi/pcmcia/built_in.o not linked into the kernel and thus > >> avoided the duplicate symbols during compilation. > >> > >> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> > >> --- > >> Ah... missed that one. Here's the updated version. > > > > Actually, isn't the better fix just to return to the original behaviour? > > > > As you pointed out, using the subdir instead of obj meant that although > > the modules were built, the drivers were never linked into the main > > kernel. According to the records, this has been the default forever, so > > there can be no-one anywhere relying on these drivers being built in. > > Actually, as old style pcmcia drivers, I'm not sure there's much value > > building them into the kernel anyway. > > > > So just modify scsi/pcmcia/Kconfig to make them all depend on m. > > Yeap, there is no problem if you don't allow them to be linked into the > kernel. If that's how you want it, please go ahead. > > I personally think it's a bit odd to disallow building into kernel > because of the peculiarity of the implementation (including c files and > compiling them slightly differently) and also no one reporting doesn't > necessarily mean no one has attempted it and failed. Heh ... I'll make you a deal. 
Find just one user of one of these drivers who can make use of them built in, and I'll apply the patch. I'm just a bit reluctant to touch these drivers, since they're all incredibly ancient. We don't have good luck with simple transformation patches on the older drivers ... and it seems to take months before anyone notices there's a problem. James
Re: [PATCH] SCSI: fix isa/pcmcia compile problem
Tejun Heo wrote: > James Bottomley wrote: >> On Fri, 2008-01-18 at 16:20 +0900, Tejun Heo wrote: >>> aha152x.c and fdomain are built twice - once for the isa driver and >>> once for the PCMCIA one. Through #ifdefs, the compiled codes are >>> slightly different; thus, global symbols need to be given different >>> names depending on which flavor is being built. This patch adds >>> GLOBAL() macro to aha152x.h and fdomain.h which change the symbol >>> depending on PCMCIA. >>> >>> This bug has always existed but has been masked by the fact the >>> drivers/scsi/pcmcia used subdir-(y|m) instead of obj-(y|m) which made >>> drivers/scsi/pcmcia/built_in.o not linked into the kernel and thus >>> avoided the duplicate symbols during compilation. >>> >>> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> >>> --- >>> Ah... missed that one. Here's the updated version. >> Actually, isn't the better fix just to return to the original behaviour? >> >> As you pointed out, using the subdir instead of obj meant that although >> the modules were built, the drivers were never linked into the main >> kernel. According to the records, this has been the default forever, so >> there can be no-one anywhere relying on these drivers being built in. >> Actually, as old style pcmcia drivers, I'm not sure there's much value >> building them into the kernel anyway. >> >> So just modify scsi/pcmcia/Kconfig to make them all depend on m. > > Yeap, there is no problem if you don't allow them to be linked into the > kernel. If that's how you want it, please go ahead. > > I personally think it's a bit odd to disallow building into kernel > because of the peculiarity of the implementation (including c files and > compiling them slightly differently) and also no one reporting doesn't > necessarily mean no one has attempted it and failed. Actually what's better would be to make all symbols static and include the c file directly into the stub file. How about that? 
-- tejun
Re: [PATCH] SCSI: fix isa/pcmcia compile problem
James Bottomley wrote: > On Fri, 2008-01-18 at 16:20 +0900, Tejun Heo wrote: >> aha152x.c and fdomain are built twice - once for the isa driver and >> once for the PCMCIA one. Through #ifdefs, the compiled codes are >> slightly different; thus, global symbols need to be given different >> names depending on which flavor is being built. This patch adds >> GLOBAL() macro to aha152x.h and fdomain.h which change the symbol >> depending on PCMCIA. >> >> This bug has always existed but has been masked by the fact the >> drivers/scsi/pcmcia used subdir-(y|m) instead of obj-(y|m) which made >> drivers/scsi/pcmcia/built_in.o not linked into the kernel and thus >> avoided the duplicate symbols during compilation. >> >> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> >> --- >> Ah... missed that one. Here's the updated version. > > Actually, isn't the better fix just to return to the original behaviour? > > As you pointed out, using the subdir instead of obj meant that although > the modules were built, the drivers were never linked into the main > kernel. According to the records, this has been the default forever, so > there can be no-one anywhere relying on these drivers being built in. > Actually, as old style pcmcia drivers, I'm not sure there's much value > building them into the kernel anyway. > > So just modify scsi/pcmcia/Kconfig to make them all depend on m. Yeap, there is no problem if you don't allow them to be linked into the kernel. If that's how you want it, please go ahead. I personally think it's a bit odd to disallow building into kernel because of the peculiarity of the implementation (including c files and compiling them slightly differently) and also no one reporting doesn't necessarily mean no one has attempted it and failed. Thanks. 
-- tejun
Re: [patch 2/3] Latencytop instrumentations part 1
Frank Ch. Eigler wrote: > Hi - > On Fri, Jan 18, 2008 at 02:33:34PM -0800, Arjan van de Ven wrote: > > [...] > > > Can you suggest some reason why all this instrumentation could > > > not be in the form of standard markers (perhaps conditionally > > > compiled out if necessary)? > > sure. Every instrumentation you see is of the nested kind (since the lowest level > > of nesting is already automatic via wchan). > > If markers can provide me the following semantics, I'd be MORE than happy to use markers: > > [...] > > If markers can provide that semantics ... you sold me. > Further to what acme said, markers are semantics-free. Callback > functions that implement your entry & exit semantics can be attached > at run time, at your pleasure. (So can systemtap probes, for that > matter.) The main difference would be that these callback functions > would have to manage the per-thread LIFO data structures themselves, > instead of allocating backpointers on the kernel stack. (Bonus marks > for not modifying task_struct. :-) modifying task struct to have storage space is no big deal...
Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles
* Steven Rostedt ([EMAIL PROTECTED]) wrote: > On Fri, 18 Jan 2008, Mathieu Desnoyers wrote: > > > > But I have not seen a lot of situations where that kind of glue-code was > > needed, so I think it makes sense to keep markers simple to use and > > efficient for the common case. > > > > Then, in this glue-code, we can put trace_mark() and calls to in-kernel > > tracers. > > I'm almost done with the latency tracer work, and there are only a total > of 6 hooks that I needed. > > - schedule context switch > - try_to_wake_up > - hard_irqs_off (which is already there for lockdep) > - hard irqs on (also for lockdep) > - lock_contention (already in for the lock contention code) > - lock acquire (also in there for contention code) > > With the above, we could have this (if this is what I think you are > recommending). For example in the context_switch case: > > trace_switch_to(prev, next); > switch_to(prev, next, prev); > > and in sched.h I could have: > Almost. I would add: static int trace_switch_to_enabled; > static inline trace_switch_to(struct task_struct *prev, > struct task_struct *next) > { if (likely(!trace_switch_to_enabled)) return; > trace_mark(kernel_schedudule, > "prev_pid %d next_pid %d prev_state %ld", > prev->pid, next->pid, prev->pid); > > trace_context_switch(prev, next); > } > And some code to activate the trace_switch_to_enabled variable (ideally keeping a refcount). By doing this, we would have the minimum impact on the scheduler when disabled. But remember that this trace_switch_to_enabled could be enabled for both markers and your tracer, so you might need to put a branch at the beginning of trace_context_switch() too. Mathieu > and have the trace_context_switch code be something that is turned on with > the latency tracing utility (config option). That way production code can > keep it off. > > -- Steve > -- Mathieu Desnoyers Computer Engineering Ph.D.
Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
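The "enabled flag plus refcount" idea from the message above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `trace_switch_to_get()`, `trace_switch_to_put()` and `events_recorded` are hypothetical names invented here, the hook takes plain pids rather than `struct task_struct` pointers, and a real kernel version would need atomic operations or locking around the counter.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static int trace_switch_to_refcount;  /* number of active users        */
static int trace_switch_to_enabled;   /* cached "refcount > 0" flag    */
static int events_recorded;           /* stands in for the trace sinks */

/* First user switches the hook on. */
static void trace_switch_to_get(void)
{
    if (trace_switch_to_refcount++ == 0)
        trace_switch_to_enabled = 1;
}

/* Last user switches the hook off again. */
static void trace_switch_to_put(void)
{
    assert(trace_switch_to_refcount > 0);
    if (--trace_switch_to_refcount == 0)
        trace_switch_to_enabled = 0;
}

static inline void trace_switch_to(int prev_pid, int next_pid)
{
    if (!trace_switch_to_enabled)   /* disabled: one test, then return */
        return;
    (void)prev_pid;
    (void)next_pid;
    events_recorded++;              /* placeholder for marker + tracer calls */
}
```

With this shape, markers and the latency tracer could each take a reference independently, and the disabled path stays a single branch.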
Re: [patch 2/3] Latencytop instrumentations part 1
Hi - On Fri, Jan 18, 2008 at 02:33:34PM -0800, Arjan van de Ven wrote: > [...] > > Can you suggest some reason why all this instrumentation could > > not be in the form of standard markers (perhaps conditionally > > compiled out if necessary)? > > sure. Every instrumentation you see is of the nested kind (since the lowest > level > of nesting is already automatic via wchan). > If markers can provide me the following semantics, I'd be MORE than happy to > use markers: > [...] > If markers can provide that semantics ... you sold me. Further to what acme said, markers are semantics-free. Callback functions that implement your entry & exit semantics can be attached at run time, at your pleasure. (So can systemtap probes, for that matter.) The main difference would be that these callback functions would have to manage the per-thread LIFO data structures themselves, instead of allocating backpointers on the kernel stack. (Bonus marks for not modifying task_struct. :-) - FChE
Re: [PATCH 3/3] Makes lguest's irq handler typesafe
Rusty Russell wrote: > Just a trivial example. > --- > drivers/lguest/lguest_device.c |3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff -r 00ab7672f658 drivers/lguest/lguest_device.c > --- a/drivers/lguest/lguest_device.c Thu Jan 17 16:54:00 2008 +1100 > +++ b/drivers/lguest/lguest_device.c Thu Jan 17 16:59:46 2008 +1100 > @@ -179,9 +179,8 @@ static void lg_notify(struct virtqueue * > hcall(LHCALL_NOTIFY, lvq->config.pfn << PAGE_SHIFT, 0, 0); > } > > -static irqreturn_t lguest_interrupt(int irq, void *_vq) > +static irqreturn_t lguest_interrupt(int irq, struct virtqueue *vq) > { > - struct virtqueue *vq = _vq; > struct lguest_device_desc *desc = to_lgdev(vq->vdev)->desc; > > if (unlikely(desc->config_change)) { Type safety is good but I doubt this would be worth the complexity. It has some benefits but there's a much larger benefit in keeping things in straight C. People know that functions take fixed types and are also familiar with the convention of passing void * for callback arguments. IMHO, staying in line with that common knowledge easily trumps having type checking on the interrupt handler. Also, how often do we see a bug where things go wrong because an interrupt handler is given the wrong type of argument? Even when such a bug happens, I doubt it can escape the developer's workstation if he/she is paying any attention to testing. Thanks. -- tejun
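For readers less familiar with the `void *` callback convention referred to above, here is a small standalone sketch. `struct queue`, `handler_t`, `queue_interrupt()` and `dispatch()` are hypothetical names for illustration only; they are not the lguest or genirq interfaces.

```c
#include <assert.h>

/* Hypothetical device state, standing in for something like a virtqueue. */
struct queue {
    int pending;
};

/* Generic handler type: the cookie is an untyped pointer. */
typedef int (*handler_t)(int irq, void *data);

static int queue_interrupt(int irq, void *data)
{
    struct queue *q = data;   /* implicit void * to struct queue * conversion */

    (void)irq;
    if (!q->pending)
        return 0;             /* nothing to do */
    q->pending = 0;
    return 1;                 /* handled */
}

/* The dispatcher knows nothing about struct queue; it just forwards the cookie. */
static int dispatch(handler_t fn, int irq, void *data)
{
    return fn(irq, data);
}
```

The compiler cannot check that the cookie passed to `dispatch()` really is a `struct queue *`, which is the trade-off under discussion: the convention is simple and widely understood, at the cost of that one unchecked conversion.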
Re: [PATCHSET] printk: implement printk_header() and merging printk
Matt Mackall wrote: > On Wed, 2008-01-16 at 10:00 +0900, Tejun Heo wrote: >> And mprintk the following. >> >> code: >> DEFINE_MPRINTK(mp, 2 * 80); >> >> mprintk_set_header(&mp, KERN_INFO "ata%u.%2u: ", 1, 0); >> mprintk_push(&mp, "ATA %d", 7); >> mprintk_push(&mp, ", %u sectors\n", 1024); >> mprintk(&mp, "everything seems dandy\n"); > > I prefer Matthew Wilcox's stringbuf approach which does proper memory > management and isn't specific to printk: > > http://www.ussg.iu.edu/hypermail/linux/kernel/0710.3/0517.html Yeap, that's generic and nice but I think both 'generic' and 'proper memory management' are weaknesses if what you're trying to do is to support collecting messages in pieces and putting them out via printk. Please consider the following scenario. You're in an interrupt handler and have detected a severe error condition which should be reported to the user, but the information is rather complex and best built in pieces, so you create a stringbuf and do sb_printf() to it w/ GFP_ATOMIC, but alas memory allocation fails and you end up printing "out of memory" unless you detect the failure and go back and printk the messages piece-by-piece manually. I would rather assemble the message manually from the get-go into an on-stack buffer. By being specifically 'printk' and letting the user supply the buffer, which in most cases can be on-stack (messages shouldn't be too long anyway), mprintk can either avoid those problems from the beginning or automatically work around them when a problem arises (initialized with a NULL buffer after allocation failure) without losing any message, so it's essentially as simple as using printk. There is no error handling (both mprintk and kfree can handle a NULL pointer) and the message is guaranteed to go out no matter what. An auto-expanding string buffer is nice but I don't think it fits the bill here. Thanks.
-- tejun
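The caller-supplied buffer argument can be sketched in userspace C. This is not the mprintk implementation from the patchset: `struct msgbuf`, `msgbuf_init()` and `msgbuf_push()` are hypothetical names, and stdout stands in for printk. The point it tries to show is the fallback behaviour described above: with a NULL or full buffer, pieces are emitted immediately instead of being lost.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names; this is NOT the mprintk API from the patchset. */
struct msgbuf {
    char *buf;      /* caller-supplied storage, may be NULL */
    size_t size;    /* capacity of buf in bytes (0 if buf is NULL) */
    size_t len;     /* bytes collected so far, excluding the NUL */
};

static void msgbuf_init(struct msgbuf *mb, char *buf, size_t size)
{
    mb->buf = buf;
    mb->size = buf ? size : 0;
    mb->len = 0;
    if (mb->size)
        buf[0] = '\0';
}

/* Append one formatted piece. Returns 0 if it was collected, or -1 if it
 * did not fit (or there is no buffer), in which case everything collected
 * so far and the new piece are written out immediately. */
static int msgbuf_push(struct msgbuf *mb, const char *fmt, ...)
{
    size_t room = mb->size - mb->len;
    va_list ap;
    int n = -1;

    if (room > 0) {
        va_start(ap, fmt);
        n = vsnprintf(mb->buf + mb->len, room, fmt, ap);
        va_end(ap);
    }
    if (n >= 0 && (size_t)n < room) {
        mb->len += (size_t)n;
        return 0;
    }

    /* No room or no buffer: flush what was collected, then emit directly. */
    if (mb->len > 0) {
        mb->buf[mb->len] = '\0';    /* discard any partial write */
        fputs(mb->buf, stdout);
        mb->len = 0;
        mb->buf[0] = '\0';
    }
    va_start(ap, fmt);
    vprintf(fmt, ap);
    va_end(ap);
    return -1;
}
```

A caller would typically declare a `char storage[160]` on the stack, pass it to `msgbuf_init()`, push the pieces, and print `mb.buf` once at the end. No allocation is involved, so there is no failure path to handle.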
Re: crash in kmem_cache_init
On Fri, Jan 18, Christoph Lameter wrote: > Could you try this patch? Does not help, same crash.
Re: [PATCH -v6 2/2] Updating ctime and mtime for memory-mapped files
On Fri, 18 Jan 2008 14:47:33 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > - keep it simple. Let's face it, Linux has never ever given those > guarantees before, and it's not as if anybody has really cared. Even > now, the issue seems to be more about paper standards conformance than > anything else. There is one issue which is way more than just standards conformance. When a program changes file data through mmap(), at some point the mtime needs to be updated so that backup programs know to back up the new version of the file. Backup programs not seeing an updated mtime is a really big deal. -- All rights reversed.