Some Serial Wacom Tablet devices failing to return from Hibernate/Suspend
A lot of Ubuntu users have noticed trouble with Wacom serial tablet devices (mainly built-in units in Toshiba tablet PCs) failing to return properly from ACPI suspend or hibernate. See the bug report at:

https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/152187

Tom Jaeger noted there that this bug seems to reside in some ACPI-related IRQ handling code for such devices, introduced in 2.6.21-rc4 via a patch to fix parallel port IRQs on resume. You can see his whole bug report on the issue (including some tentative shot-in-the-dark patches) at:

http://bugzilla.kernel.org/show_bug.cgi?id=9487

This seems to be affecting a fair number of users, and none of us active on the bug in the Ubuntu tracker is familiar enough with the ACPI subsystem to create a correct patch for this issue.

Is anyone familiar enough with this area to provide some guidance on the correct way to fix this issue, beyond the quick fix Tom Jaeger posted?

Michael Heath
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] constify tables in kernel/sysctl_check.c
>>> Eric W. Biederman <[EMAIL PROTECTED]> 21.12.07 00:05 >>>
>"Jan Beulich" <[EMAIL PROTECTED]> writes:
>
>> Remains the question whether it is intended that many, perhaps even
>> large, tables are compiled in without ever having a chance to get used,
>> i.e. whether there shouldn't #ifdef CONFIG_xxx get added.
>
>
>The constification looks good. The file should be compiled only when
>we have sysctl support. We use those tables when we call
>register_sysctl_table. Which we do a lot.

I understand this. Nevertheless, the tables take 23k on 64-bits, and many of them are unused when certain subsystems aren't being built (and some are even architecture specific). The arlan tables are a particularly good example, but the netfilter ones are pretty big and probably not always used, too.

Jan
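For readers following the thread, here is a minimal userspace sketch of the idea being discussed: a constified lookup table whose subsystem-specific part is only compiled in when a configuration symbol is defined. It is not the code from kernel/sysctl_check.c; `struct name_entry`, `CONFIG_EXAMPLE_SUBSYS`, the table names and `table_len()` are all hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct trans_ctl_table. */
struct name_entry {
	int ctl_name;
	const char *procname;
};

/*
 * Hypothetical configuration symbol. In the kernel it would come from
 * Kconfig; here it is simply left undefined.
 */
/* #define CONFIG_EXAMPLE_SUBSYS 1 */

#ifdef CONFIG_EXAMPLE_SUBSYS
/* Only costs space when the (hypothetical) subsystem is built. */
static const struct name_entry example_subsys_table[] = {
	{ 1, "example_param_a" },
	{ 2, "example_param_b" },
	{ 0, NULL }
};
#endif

/* const lets the compiler place the table in read-only data. */
static const struct name_entry root_table[] = {
	{ 1, "kernel" },
#ifdef CONFIG_EXAMPLE_SUBSYS
	{ 2, "example" },
#endif
	{ 0, NULL }	/* terminator */
};

/* Count the entries that precede the terminator. */
static size_t table_len(const struct name_entry *t)
{
	size_t n = 0;

	assert(t != NULL);
	while (t[n].procname)
		n++;
	return n;
}
```

With the symbol left undefined, `table_len(root_table)` yields 1 and `example_subsys_table` is never emitted into the object file, which is the space saving Jan is asking about.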
Re: almost daily Kernel oops with 2.6.23.9 - and now 2.6.23.11 as well
On Thu, 2007-12-20 at 19:14 +0100, Hemmann, Volker Armin wrote:

> It is just.. It could be the hardware - but I should have seen the
> same 'problem' with earlier kernels - and the 'almost daily oops' only
> started with 2.6.23.

Nonetheless, the oopsen _suggest_ hardware. If it were my box, I'd move RAM modules as a first step. It costs about two minutes to eliminate that possibility, but you seem reluctant to take that step. Heck, I'd _hope_ it's something as simple as bad RAM, because otherwise, the quest for stability could become a time-consuming and/or expensive undertaking...

If that didn't change anything, I'd go back and stress test a previously stable configuration to gain confidence in my hardware. If 'uhoh, not as stable as I thought' happened, and nothing is getting obviously hot [1], I'd pray that it's an electrically noisy power supply, because that's also easy and cheap.

In any case, once I was very very confident that my hardware was indeed sound, I'd move on to an agonizingly tedious bisection, with no out-of-tree modules ever loaded, to narrow down when this memory corruption that nobody else appears to be hitting appeared.

	-Mike

1. Crappy heatsink compound can dry out and fracture, leaving a hot chip under a relatively cool heatsink. This is exactly what I found when I disassembled my suddenly unstable under heavy load P4 box a while back.
[PATCH 3/3 -mm] kexec jump -v8 : access memory image of kexec_image
This patch adds a file in the proc file system to access the loaded kexec_image, which may contain the memory image of the kexeced system. This can be used to:

- Communicate between original kernel and kexeced kernel through write to some pages in original kernel.

- Communicate between original kernel and kexeced kernel through read memory image of kexeced kernel, amend the image, and reload the amended image.

- Accelerate boot of kexeced kernel. If you have a memory image of the kexeced kernel, you do not need a normal boot process to jump to the kexeced kernel: just load the memory image and jump to the point where you last left the kexeced kernel.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 fs/proc/Makefile      |    1
 fs/proc/kimgcore.c    |  277 ++
 fs/proc/proc_misc.c   |    6 +
 include/linux/kexec.h |    7 +
 kernel/kexec.c        |    5
 5 files changed, 291 insertions(+), 5 deletions(-)

--- /dev/null
+++ b/fs/proc/kimgcore.c
@@ -0,0 +1,277 @@
+/*
+ * fs/proc/kimgcore.c - Interface for accessing the loaded
+ * kexec_image, which may contains the memory image of kexeced system.
+ * Heavily borrowed from fs/proc/kcore.c
+ *
+ * Copyright (C) 2007, Intel Corp.
+ *	Huang Ying <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+struct proc_dir_entry *proc_root_kimgcore;
+
+static u32 kimgcore_size;
+
+static char *elfcorebuf;
+static size_t elfcorebuf_sz;
+
+static void *buf_page;
+
+static ssize_t kimage_copy_to_user(struct kimage *image, char __user *buf,
+				   unsigned long offset, size_t count)
+{
+	kimage_entry_t *ptr, entry;
+	unsigned long off = 0, offinp, trunk;
+	struct page *page;
+	void *vaddr;
+
+	for_each_kimage_entry(image, ptr, entry) {
+		if (!(entry & IND_SOURCE))
+			continue;
+		if (off + PAGE_SIZE > offset) {
+			offinp = offset - off;
+			if (count > PAGE_SIZE - offinp)
+				trunk = PAGE_SIZE - offinp;
+			else
+				trunk = count;
+			page = pfn_to_page(entry >> PAGE_SHIFT);
+			if (PageHighMem(page)) {
+				vaddr = kmap(page);
+				memcpy(buf_page, vaddr+offinp, trunk);
+				kunmap(page);
+				vaddr = buf_page;
+			} else
+				vaddr = __va(entry & PAGE_MASK) + offinp;
+			if (copy_to_user(buf, vaddr, trunk))
+				return -EFAULT;
+			buf += trunk;
+			offset += trunk;
+			count -= trunk;
+			if (!count)
+				break;
+		}
+		off += PAGE_SIZE;
+	}
+	return count;
+}
+
+static ssize_t kimage_copy_from_user(struct kimage *image,
+				     const char __user *buf,
+				     unsigned long offset,
+				     size_t count)
+{
+	kimage_entry_t *ptr, entry;
+	unsigned long off = 0, offinp, trunk;
+	struct page *page;
+	void *vaddr;
+
+	for_each_kimage_entry(image, ptr, entry) {
+		if (!(entry & IND_SOURCE))
+			continue;
+		if (off + PAGE_SIZE > offset) {
+			offinp = offset - off;
+			if (count > PAGE_SIZE - offinp)
+				trunk = PAGE_SIZE - offinp;
+			else
+				trunk = count;
+			page = pfn_to_page(entry >> PAGE_SHIFT);
+			if (PageHighMem(page))
+				vaddr = buf_page;
+			else
+				vaddr = __va(entry & PAGE_MASK) + offinp;
+			if (copy_from_user(vaddr, buf, trunk))
+				return -EFAULT;
+			if (PageHighMem(page)) {
+				vaddr = kmap(page);
+				memcpy(vaddr+offinp, buf_page, trunk);
+				kunmap(page);
+			}
+			buf += trunk;
+			offset += trunk;
+			count -= trunk;
+			if (!count)
+				break;
+		}
+		off += PAGE_SIZE;
+	}
+	return count;
+}
+
+static ssize_t read_kimgcore(struct file *file, char __user *buffer,
+			     size_t buflen, loff_t *fpos)
+{
+	size_t acc = 0;
+	size_t tsz;
Re: [Jan Beulich] [PATCH] constify tables in kernel/sysctl_check.c
Thanks for catching this!

>>> Dave Jones <[EMAIL PROTECTED]> 21.12.07 03:30 >>>
On Thu, Dec 20, 2007 at 04:14:05PM -0700, Eric W. Biederman wrote:

 > Remains the question whether it is intended that many, perhaps even
 > large, tables are compiled in without ever having a chance to get used,
 > i.e. whether there shouldn't #ifdef CONFIG_xxx get added.

 > -static struct trans_ctl_table trans_net_ax25_param_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

we lost the _param, which will cause a duplicate definition with ..

 > -static struct trans_ctl_table trans_net_ax25_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

cut-n-paste thinko ?

	Dave

--
http://www.codemonkey.org.uk
[PATCH 1/3 -mm] kexec jump -v8 : kexec jump basic
This patch implements the functionality of jumping between the kexeced kernel and the original kernel.

To support jumping between two kernels, before jumping to (executing) the new kernel and jumping back to the original kernel, the devices are put into a quiescent state, and the state of the devices and CPU is saved. After jumping back from the kexeced kernel and jumping to the new kernel, the state of the devices and CPU is restored accordingly. The devices/CPU state save/restore code of software suspend is called to implement the corresponding function.

To support jumping without reserving memory, one shadow backup page (source page) is allocated for each page used by the new (kexeced) kernel (destination page). When kexec_load is done, the image of the new kernel is loaded into the source pages, and before executing, the destination pages and the source pages are swapped, so the contents of the destination pages are backed up. Before jumping to the new (kexeced) kernel and after jumping back to the original kernel, the destination pages and the source pages are swapped too.

A jump back protocol for kexec is defined and documented. It is an extension to the ordinary function calling protocol. So, the facility provided by this patch can be used to call an ordinary C function in physical mode.

A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to indicate that the loaded kernel image is used for jumping back.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 Documentation/i386/jump_back_protocol.txt |   66 ++
 arch/powerpc/kernel/machine_kexec.c       |    2
 arch/ppc/kernel/machine_kexec.c           |    2
 arch/sh/kernel/machine_kexec.c            |    2
 arch/x86/kernel/machine_kexec_32.c        |   39 +-
 arch/x86/kernel/machine_kexec_64.c        |    2
 arch/x86/kernel/relocate_kernel_32.S      |  194 ++
 include/asm-x86/kexec_32.h                |   34 -
 include/linux/kexec.h                     |   14 +-
 kernel/kexec.c                            |   65 +-
 kernel/power/Kconfig                      |    2
 kernel/sys.c                              |   35 +++--
 12 files changed, 403 insertions(+), 54 deletions(-)

--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
 static u32 kexec_pgd[1024] PAGE_ALIGNED;
@@ -83,10 +84,14 @@ static void load_segments(void)
  * reboot code buffer to allow us to avoid allocations
  * later.
  *
- * Currently nothing.
+ * Turn off NX bit for control page.
  */
 int machine_kexec_prepare(struct kimage *image)
 {
+	if (nx_enabled) {
+		change_page_attr(image->control_code_page, 1, PAGE_KERNEL_EXEC);
+		global_flush_tlb();
+	}
 	return 0;
 }
 
@@ -96,25 +101,45 @@ int machine_kexec_prepare(struct kimage
  */
 void machine_kexec_cleanup(struct kimage *image)
 {
+	if (nx_enabled) {
+		change_page_attr(image->control_code_page, 1, PAGE_KERNEL);
+		global_flush_tlb();
+	}
 }
 
 /*
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
-NORET_TYPE void machine_kexec(struct kimage *image)
+void machine_kexec(struct kimage *image)
 {
 	unsigned long page_list[PAGES_NR];
 	void *control_page;
+	asmlinkage NORET_TYPE void
+		(*relocate_kernel_ptr)(unsigned long indirection_page,
+				       unsigned long control_page,
+				       unsigned long start_address,
+				       unsigned int has_pae) ATTRIB_NORET;
 
 	/* Interrupts aren't acceptable while we reboot */
 	local_irq_disable();
 
 	control_page = page_address(image->control_code_page);
-	memcpy(control_page, relocate_kernel, PAGE_SIZE);
+	memcpy(control_page, relocate_page, PAGE_SIZE/2);
+	KJUMP_MAGIC(control_page) = 0;
 
+	if (image->preserve_context) {
+		KJUMP_MAGIC(control_page) = KJUMP_MAGIC_NUMBER;
+		if (kexec_jump_save_cpu(control_page)) {
+			image->start = KJUMP_ENTRY(control_page);
+			return;
+		}
+	}
+
+	relocate_kernel_ptr = control_page +
+		((void *)relocate_kernel - (void *)relocate_page);
 	page_list[PA_CONTROL_PAGE] = __pa(control_page);
-	page_list[VA_CONTROL_PAGE] = (unsigned long)relocate_kernel;
+	page_list[VA_CONTROL_PAGE] = (unsigned long)control_page;
 	page_list[PA_PGD] = __pa(kexec_pgd);
 	page_list[VA_PGD] = (unsigned long)kexec_pgd;
 #ifdef CONFIG_X86_PAE
@@ -127,6 +152,7 @@ NORET_TYPE void machine_kexec(struct kim
 	page_list[VA_PTE_0] = (unsigned long)kexec_pte0;
 	page_list[PA_PTE_1] = __pa(kexec_pte1);
 	page_list[VA_PTE_1] = (unsigned
[PATCH 2/3 -mm] kexec jump -v8 : add write support to oldmem device
This patch adds writing support for /dev/oldmem. This can be used to

- Communicate between original kernel and kexeced kernel through write to some pages in original kernel.

- Restore the memory contents of hibernated system in kexec based hibernation.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 arch/x86/kernel/crash_dump_32.c |   27 +++
 drivers/char/mem.c              |   32
 include/linux/crash_dump.h      |    2 ++
 3 files changed, 61 insertions(+)

--- a/arch/x86/kernel/crash_dump_32.c
+++ b/arch/x86/kernel/crash_dump_32.c
@@ -59,6 +59,33 @@ ssize_t copy_oldmem_page(unsigned long p
 	return csize;
 }
 
+ssize_t write_oldmem_page(unsigned long pfn, const char *buf,
+			  size_t csize, unsigned long offset, int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	if (!userbuf) {
+		vaddr = kmap_atomic_pfn(pfn, KM_PTE0);
+		memcpy(vaddr + offset, buf, csize);
+	} else {
+		if (!kdump_buf_page) {
+			printk(KERN_WARNING "Kdump: Kdump buffer page not"
+			       " allocated\n");
+			return -EFAULT;
+		}
+		if (copy_from_user(kdump_buf_page, buf, csize))
+			return -EFAULT;
+		vaddr = kmap_atomic_pfn(pfn, KM_PTE0);
+		memcpy(vaddr + offset, kdump_buf_page, csize);
+	}
+	kunmap_atomic(vaddr, KM_PTE0);
+
+	return csize;
+}
+
 static int __init kdump_buf_page_init(void)
 {
 	int ret = 0;
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -11,6 +11,8 @@
 extern unsigned long long elfcorehdr_addr;
 extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 				unsigned long, int);
+extern ssize_t write_oldmem_page(unsigned long, const char *, size_t,
+				 unsigned long, int);
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -348,6 +348,37 @@ static ssize_t read_oldmem(struct file *
 	}
 	return read;
 }
+
+/*
+ * Write memory corresponding to the old kernel.
+ */
+static ssize_t write_oldmem(struct file *file, const char __user *buf,
+			    size_t count, loff_t *ppos)
+{
+	unsigned long pfn, offset;
+	size_t write = 0, csize;
+	int rc = 0;
+
+	while (count) {
+		pfn = *ppos / PAGE_SIZE;
+		if (pfn > saved_max_pfn)
+			return write;
+
+		offset = (unsigned long)(*ppos % PAGE_SIZE);
+		if (count > PAGE_SIZE - offset)
+			csize = PAGE_SIZE - offset;
+		else
+			csize = count;
+		rc = write_oldmem_page(pfn, buf, csize, offset, 1);
+		if (rc < 0)
+			return rc;
+		buf += csize;
+		*ppos += csize;
+		write += csize;
+		count -= csize;
+	}
+	return write;
+}
 #endif
 
 extern long vread(char *buf, char *addr, unsigned long count);
@@ -783,6 +814,7 @@ static const struct file_operations full
 #ifdef CONFIG_CRASH_DUMP
 static const struct file_operations oldmem_fops = {
 	.read	= read_oldmem,
+	.write	= write_oldmem,
 	.open	= open_oldmem,
 };
 #endif
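The write_oldmem() loop in the patch above splits one write into pieces that never cross a page boundary. The following userspace sketch isolates just that arithmetic so it can be checked by hand. It is not part of the patch: `SKETCH_PAGE_SIZE`, `struct chunk`, `first_chunk()` and `count_chunks()` are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* One page-bounded piece of a larger write. */
struct chunk {
	unsigned long pfn;	/* page frame the piece lands in */
	unsigned long offset;	/* byte offset inside that page */
	size_t len;		/* bytes to copy; never crosses the page end */
};

/* Compute the first piece of a write of `count` bytes at position `pos`. */
static struct chunk first_chunk(unsigned long long pos, size_t count)
{
	struct chunk c;

	c.pfn = (unsigned long)(pos / SKETCH_PAGE_SIZE);
	c.offset = (unsigned long)(pos % SKETCH_PAGE_SIZE);
	if (count > SKETCH_PAGE_SIZE - c.offset)
		c.len = SKETCH_PAGE_SIZE - c.offset;
	else
		c.len = count;
	return c;
}

/* Walk the whole write the way the loop does and count the pieces. */
static size_t count_chunks(unsigned long long pos, size_t count)
{
	size_t n = 0;

	while (count) {
		struct chunk c = first_chunk(pos, count);

		assert(c.len > 0 && c.len <= count);
		pos += c.len;
		count -= c.len;
		n++;
	}
	return n;
}
```

For example, a 100-byte write starting at byte 4090 is split into a 6-byte piece at the end of page 0 and a 94-byte piece at the start of page 1.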
[PATCH 0/3 -mm] kexec jump -v8
This patchset provides an enhancement to kexec/kdump. It implements the following features:

- Backup/restore memory used both by the original kernel and the kexeced kernel.

- Jumping between the original kernel and the kexeced kernel.

- Read/write memory image of the kexeced kernel in the original kernel and write memory image of the original kernel in the kexeced kernel. This can be used as a communication method between the kexeced kernel and the original kernel.

The features of this patchset can be used as follows:

- Kernel/system debug through making system snapshot. You can make a system snapshot, jump back, do something and make another system snapshot.

- A simple hibernation implementation without ACPI support. You can kexec a hibernating kernel, save the memory image of the original system and shut down the system. When resuming, you boot a resuming kernel in the memory range of the kexeced kernel, restore the memory image of the original system and jump back.

- Cooperative multi-kernel/system. With kexec jump, you can switch between several kernels/systems quickly without a boot process except the first time. This appears like swapping a whole kernel/system out/in.

- A general method to call a program in physical mode. This can be used to invoke some BIOS code under Linux.

- The basis of a full kexec based hibernation implementation with ACPI support. The full kexec based hibernation implementation is provided in another patchset named kexec based hibernation.

Now, only the i386 architecture is supported. The patchset is based on Linux kernel 2.6.24-rc5-mm1, and has been tested on IBM T42 with ACPI on and off.

The following user-space tools can be used with kexec jump.

1. kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be downloaded from the following URL:

   source: http://khibernation.sourceforge.net/download/release_v8/kexec-tools/kexec-tools-src_git_kh8.tar.bz2
   patches: http://khibernation.sourceforge.net/download/release_v8/kexec-tools/kexec-tools-patches_git_kh8.tar.bz2
   binary: http://khibernation.sourceforge.net/download/release_v8/kexec-tools/kexec_git_kh8

2. makedumpfile with patches is used as the memory image saving tool; it can exclude free pages from the original kernel memory image file. The patches and the precompiled makedumpfile can be downloaded from the following URL:

   source: http://khibernation.sourceforge.net/download/release_v8/makedumpfile/makedumpfile-src_cvs_kh8.tar.bz2
   patches: http://khibernation.sourceforge.net/download/release_v8/makedumpfile/makedumpfile-patches_cvs_kh8.tar.bz2
   binary: http://khibernation.sourceforge.net/download/release_v8/makedumpfile/makedumpfile_cvs_kh8

3. A very simple memory image restoring tool named "krestore" is implemented. It can be downloaded from the following URL:

   source: http://khibernation.sourceforge.net/download/release_v8/krestore/krestore-src_cvs_kh8.tar.bz2
   binary: http://khibernation.sourceforge.net/download/release_v8/krestore/krestore_cvs_kh8

An initramfs image can be used as the root file system of the kexeced kernel. An initramfs image built with "BuildRoot" can be downloaded from the following URL:

   initramfs image: http://khibernation.sourceforge.net/download/release_v8/initramfs/rootfs_cvs_kh8.gz

All user space tools above are included in the initramfs image.

Usage example of jumping between original and kexeced kernel:

1. Compile and install the patched kernel with the following options selected:

   CONFIG_X86_32=y
   CONFIG_RELOCATABLE=y
   CONFIG_KEXEC=y
   CONFIG_CRASH_DUMP=y
   CONFIG_PM=y

2. Build an initramfs image that contains kexec-tool, or download the pre-built initramfs image, called rootfs.gz in the following text.

3. Boot the kernel compiled in step 1.

4. Load the kernel compiled in step 1 with /sbin/kexec. If you want to use the "krestore" tool, --elf64-core-headers should be specified on the command line of /sbin/kexec. The shell command line can be as follows:

   /sbin/kexec --load-jump-back /boot/bzImage --mem-min=0x10 --mem-max=0xff --elf64-core-headers --initrd=rootfs.gz

5. Boot the kexeced kernel with the following shell command line:

   /sbin/kexec -e

6. The kexeced kernel will boot as a normal kexec. In the kexeced kernel the memory image of the original kernel can be read via /proc/vmcore or /dev/oldmem, and can be written via /dev/oldmem. You can save/restore/modify it as you want to.

7. Prepare jumping back from the kexeced kernel with the following shell command lines:

   jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '=' -f 2`
   /sbin/kexec --load-jump-back-helper=$jump_back_entry

8. Jump back to the original kernel with the following shell command line:

   /sbin/kexec -e

9. Now, you are in the original kernel again. You can read/write the memory image of the kexeced kernel via /proc/kimgcore.

10. You can jump between the original kernel and
Re: [PATCH 0/4] add task handling notifier
>Yes, but why export variables? Wouldn't it be better to export
>an API?
>
>That simplifies the callers (they all pass "current" as task
>and "task_notifier_list" as arguments).
>
>It also prevents exposing internal variables (notifier lists
>ARE internal variables) to modules.
>
>What do you think?

Would be a simple change if the concept itself is generally welcome. Will first see whether I get other comments requiring re-work.

Jan
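To make the suggestion in the quoted text concrete, here is a small userspace sketch of the "export an API, not the variable" shape. It does not use the kernel's notifier chain interface and is not the code from the patch under review; `task_cb`, `task_notifier_register()`, `task_notifier_call()` and the fixed-size array are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives an event code and a task id. */
typedef int (*task_cb)(unsigned long event, int task_id);

#define MAX_CALLBACKS 8

/* The list is file-local, so other code cannot touch it directly. */
static task_cb callbacks[MAX_CALLBACKS];
static size_t nr_callbacks;

/* Exported entry point: add a callback. Returns 0, or -1 if the list is full. */
int task_notifier_register(task_cb cb)
{
	assert(cb != NULL);
	if (nr_callbacks == MAX_CALLBACKS)
		return -1;
	callbacks[nr_callbacks++] = cb;
	return 0;
}

/*
 * Exported entry point: run every callback for the given task.
 * Callers no longer pass the list itself. Returns the number of
 * callbacks that were invoked.
 */
size_t task_notifier_call(unsigned long event, int task_id)
{
	size_t i;

	for (i = 0; i < nr_callbacks; i++)
		callbacks[i](event, task_id);
	return nr_callbacks;
}

/* Example callback for the usage note: remembers the last event seen. */
static unsigned long last_event;

static int remember_event(unsigned long event, int task_id)
{
	(void)task_id;
	last_event = event;
	return 0;
}
```

After `task_notifier_register(remember_event)`, a call such as `task_notifier_call(42, 1)` runs the callback and leaves `last_event` at 42, without the caller ever naming the list.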
[PATCH] w1-gpio: Add GPIO w1 bus master driver
Add a GPIO 1-wire bus master driver. The driver uses the GPIO API to control the wire, and the GPIO pin can be specified using platform data, similar to i2c-gpio. The driver was tested with AT91SAM9260 + DS2401.

Signed-off-by: Ville Syrjala <[EMAIL PROTECTED]>
---
 Documentation/w1/masters/00-INDEX |    2 +
 Documentation/w1/masters/w1-gpio  |   33
 drivers/w1/masters/Kconfig        |   10
 drivers/w1/masters/Makefile       |    1 +
 drivers/w1/masters/w1-gpio.c      |  100 +
 include/linux/w1-gpio.h           |   21
 6 files changed, 167 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/w1/masters/w1-gpio
 create mode 100644 drivers/w1/masters/w1-gpio.c
 create mode 100644 include/linux/w1-gpio.h

diff --git a/Documentation/w1/masters/00-INDEX b/Documentation/w1/masters/00-INDEX
index 752613c..7b0ceaa 100644
--- a/Documentation/w1/masters/00-INDEX
+++ b/Documentation/w1/masters/00-INDEX
@@ -4,3 +4,5 @@ ds2482
 	- The Maxim/Dallas Semiconductor DS2482 provides 1-wire busses.
 ds2490
 	- The Maxim/Dallas Semiconductor DS2490 builds USB <-> W1 bridges.
+w1-gpio
+	- GPIO 1-wire bus master driver.
diff --git a/Documentation/w1/masters/w1-gpio b/Documentation/w1/masters/w1-gpio
new file mode 100644
index 000..c927139
--- /dev/null
+++ b/Documentation/w1/masters/w1-gpio
@@ -0,0 +1,33 @@
+Kernel driver w1-gpio
+=====================
+
+Author: Ville Syrjala <[EMAIL PROTECTED]>
+
+
+Description
+-----------
+
+GPIO 1-wire bus master driver. The driver uses the GPIO API to control the
+wire and the GPIO pin can be specified using platform data. The GPIO pin
+must be configured as open-drain.
+
+
+Example (mach-at91)
+-------------------
+
+#include
+
+static struct w1_gpio_platform_data foo_w1_gpio_pdata = {
+	.pin		= AT91_PIN_PB20,
+};
+
+static struct platform_device foo_w1_device = {
+	.name			= "w1-gpio",
+	.id			= -1,
+	.dev.platform_data	= &foo_w1_gpio_pdata,
+};
+
+...
+	at91_set_GPIO_periph(foo_w1_gpio_pdata.pin, 1);
+	at91_set_multi_drive(foo_w1_gpio_pdata.pin, 1);
+	platform_device_register(&foo_w1_device);
diff --git a/drivers/w1/masters/Kconfig b/drivers/w1/masters/Kconfig
index 8236d44..c449309 100644
--- a/drivers/w1/masters/Kconfig
+++ b/drivers/w1/masters/Kconfig
@@ -42,5 +42,15 @@ config W1_MASTER_DS1WM
 	  in HP iPAQ devices like h5xxx, h2200, and ASIC3-based like
 	  hx4700.
 
+config W1_MASTER_GPIO
+	tristate "GPIO 1-wire busmaster"
+	depends on GENERIC_GPIO
+	help
+	  Say Y here if you want to communicate with your 1-wire devices using
+	  GPIO pins. This driver uses the GPIO API to control the wire.
+
+	  This support is also available as a module. If so, the module
+	  will be called w1-gpio.ko.
+
 endmenu
 
diff --git a/drivers/w1/masters/Makefile b/drivers/w1/masters/Makefile
index 11551b3..1420b5b 100644
--- a/drivers/w1/masters/Makefile
+++ b/drivers/w1/masters/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_W1_MASTER_MATROX)	+= matrox_w1.o
 obj-$(CONFIG_W1_MASTER_DS2490)	+= ds2490.o
 obj-$(CONFIG_W1_MASTER_DS2482)	+= ds2482.o
 obj-$(CONFIG_W1_MASTER_DS1WM)	+= ds1wm.o
+obj-$(CONFIG_W1_MASTER_GPIO)	+= w1-gpio.o
diff --git a/drivers/w1/masters/w1-gpio.c b/drivers/w1/masters/w1-gpio.c
new file mode 100644
index 000..c5327df
--- /dev/null
+++ b/drivers/w1/masters/w1-gpio.c
@@ -0,0 +1,100 @@
+/*
+ * w1-gpio - GPIO w1 bus master driver
+ *
+ * Copyright (C) 2007 Ville Syrjala <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ */
+
+#include
+#include
+#include
+#include
+
+#include "../w1.h"
+#include "../w1_int.h"
+
+#include
+
+static void w1_gpio_write_bit(void *data, u8 bit)
+{
+	struct w1_gpio_platform_data *pdata = data;
+
+	gpio_set_value(pdata->pin, bit);
+}
+
+static u8 w1_gpio_read_bit(void *data)
+{
+	struct w1_gpio_platform_data *pdata = data;
+
+	return gpio_get_value(pdata->pin);
+}
+
+static int __init w1_gpio_probe(struct platform_device *pdev)
+{
+	struct w1_bus_master *master;
+	struct w1_gpio_platform_data *pdata;
+	int err;
+
+	pdata = pdev->dev.platform_data;
+	if (!pdata)
+		return -ENXIO;
+
+	master = kzalloc(sizeof *master, GFP_KERNEL);
+	if (!master)
+		return -ENOMEM;
+
+	gpio_direction_output(pdata->pin, 1);
+
+	master->data = pdata;
+	master->read_bit = &w1_gpio_read_bit;
+	master->write_bit = &w1_gpio_write_bit;
+
+	err = w1_add_master_device(master);
+	if (err) {
+		kfree(master);
+		return err;
+	}
+
+	platform_set_drvdata(pdev, master);
+
+	return 0;
+}
+
+static int
Re: driver spin lock and files_lock deadlock question
On Thu, Dec 20, 2007 at 10:59:20PM -0800, Srinivas Kommu wrote:
> It seems this kind of a deadlock can happen with any kernel lock, not
> just files_lock. What's the driver's mistake here? Is it wrong to call
> remove_proc_entry() while holding another lock? What is the right thing
> to do?

remove_proc_entry() is a blocking function...
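One common way to act on that answer is to detach whatever needs tearing down while the driver lock is held, and only call the blocking function after the lock has been dropped. The sketch below shows that shape in plain userspace C. It is an illustration under stated assumptions, not the poster's driver: `struct toy_lock` merely records whether the lock is held (it is not a real spinlock), and `blocking_remove()` stands in for a call such as remove_proc_entry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a spinlock: it only records whether it is held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

static struct toy_lock driver_lock;

/* Hypothetical per-entry driver state, kept on a singly linked list. */
struct entry {
	const char *name;
	bool registered;
	struct entry *next;
};

static struct entry *entries;

/* Stand-in for a function that may block or take other locks. */
static void blocking_remove(struct entry *e)
{
	assert(!driver_lock.held);	/* must never run under driver_lock */
	e->registered = false;
}

/* Detach the list under the lock, then do the blocking work unlocked. */
static size_t driver_teardown(void)
{
	struct entry *detached, *e;
	size_t n = 0;

	toy_lock_acquire(&driver_lock);
	detached = entries;
	entries = NULL;
	toy_lock_release(&driver_lock);

	for (e = detached; e; e = e->next) {
		blocking_remove(e);
		n++;
	}
	return n;
}
```

Because the softirq path can only ever contend for `driver_lock` during the short detach step, the blocking call no longer nests inside it, so the lock-order inversion described in the question cannot arise on this path.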
Re: [PATCH -mm 18/43] powerpc compat_binfmt_elf
On Thu, Dec 20, 2007 at 03:58:16AM -0800, Roland McGrath wrote:
> +obj-$(CONFIG_PPC64) += ../../../fs/compat_binfmt_elf.o

Building files from another directory is nasty. Please add a CONFIG_BINFMT_COMPAT_ELF so we can simply build it in fs/
Re: [PATCH -mm 09/43] compat_sys_ptrace
On Thu, Dec 20, 2007 at 03:55:51AM -0800, Roland McGrath wrote:
> This adds a generic definition of compat_sys_ptrace that calls
> compat_arch_ptrace, parallel to sys_ptrace/arch_ptrace. Some
> machines needing this already define a function by that name.
> The new generic function is defined only on machines that
> put #define __ARCH_WANT_COMPAT_SYS_PTRACE into asm/ptrace.h.

Nice, we should have unified the compat ptrace code long ago. Any chance you could make the ifdef symmetric to the native ptrace, where an arch defines a symbol if it has its own ptrace?

Also, when prototyping something like this I was wondering whether we really want a separate compat function. Lots of the ptrace requests mostly depend on the target process's ABI, not the ptrace caller's, so maybe doing it like s390 and handling both in the same function might actually be cleaner. Anyway, that's probably something to worry about later, once the arch-specific compat ptrace implementations are gone.
driver spin lock and files_lock deadlock question
I have a driver that needs to be SMP-safe. It also has some code hooking into the net_rx_action softirq. So it takes a spinlock and disables the local bottom-half around its critical sections: spin_lock_bh(&driver_lock).

Now, I'm facing a deadlock under a particular sequence involving the files_lock:

1. CPU 0 takes driver_lock and then calls remove_proc_entry(), which is hanging at spin_lock(&files_lock).

2. CPU 1 was in fput(), which took files_lock; the softirq comes in at this point, attempts to take driver_lock and hangs forever.

It seems this kind of a deadlock can happen with any kernel lock, not just files_lock. What's the driver's mistake here? Is it wrong to call remove_proc_entry() while holding another lock? What is the right thing to do?

This is with the 2.4 kernel, BTW.

thanks
srini
Re: [PATCH -mm 01/43] user_regset header
On Thu, Dec 20, 2007 at 03:53:57AM -0800, Roland McGrath wrote:
> +/*
> + * User-mode machine state access
> + *
> + * Copyright (C) 2007 Red Hat, Inc. All rights reserved.
> + *
> + * This copyrighted material is made available to anyone wishing to use,
> + * modify, copy, or redistribute it subject to the terms and conditions
> + * of the GNU General Public License v.2.
> + *
> + * Red Hat Author: Roland McGrath.

What's a Red Hat Author? Sorry for the nitpicking, but why don't you just use Author like everyone else?
Re: OOPS: 2.6.24-rc5-mm1 -- EIP is at r_show+0x2a/0x70 -- (triggered by "cat /proc/iomem" AFTER suspend-to-disk/resume)
On Fri, 21 Dec 2007 00:58:19 -0500 "Miles Lane" <[EMAIL PROTECTED]> wrote:

> On Dec 20, 2007 12:32 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > On Thu, 20 Dec 2007 08:38:03 -0500 Miles Lane <[EMAIL PROTECTED]>
> > wrote:
> >
> > > On further investigation, "cat /proc/iomem" does not trigger the stack
> > > trace until after a suspend-to-disk/resume cycle has occurred.
> >
> > I still can't reproduce this.
> >
> > Could you please try this?
> >
> > - cat /proc/iomem
> > - suspend/resume
> > - do
> >
> >       while read i
> >       do
> >               echo $i
> >               sleep 1
> >       done < /proc/iomem
> >
> > then, with luck, we'll be able to work out which /proc/iomem record
> > immediately precedes the corrupted one.
> >
>
> [EMAIL PROTECTED]:~$ cat > test.sh
> while read i
> do
> echo $i
> sleep 1
> done < /proc/iomem
> ^C
> [EMAIL PROTECTED]:~$ sh test.sh
> -0009f7ff : System RAM
> 0009f800-0009 : reserved
> 000a-000b : Video RAM area
> 000c-000c7fff : Video ROM
> 000f-000f : System ROM
> 0010-7f68 : System RAM
> 0010-0039e4b7 : Kernel code
> 0039e4b8-004f0983 : Kernel data
> 00553000-007ecdfb : Kernel bss
> 7f69-7f698fff : ACPI Tables
> 7f699000-7f6f : ACPI Non-volatile Storage
> 7f70-7fff : reserved
> 8800-8bff : PCI CardBus #05
> 8c00-8fff : PCI CardBus #05
> Segmentation fault
>
> How do I determine what comes next?
>

By comparing it with the /proc/iomem from prior to suspending the machine.
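The comparison suggested above can be done by eye, but as a rough sketch it amounts to the following: check that the truncated post-resume listing is a prefix of the pre-suspend listing, and report the first pre-suspend record that is missing. The function name `next_expected()` and the sample records are hypothetical placeholders, not real /proc/iomem contents, and the sketch assumes both listings were captured the same way (for instance both through the `read`/`echo` loop, which strips leading indentation).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * `before` holds the records saved prior to suspend, `after` holds the
 * records printed after resume, up to the crash. Returns the record that
 * should have been printed next, or NULL if the listings diverge earlier
 * or nothing is missing.
 */
static const char *next_expected(const char *const before[], size_t n_before,
				 const char *const after[], size_t n_after)
{
	size_t i;

	assert(before != NULL && after != NULL);
	for (i = 0; i < n_after; i++) {
		if (i >= n_before || strcmp(before[i], after[i]) != 0)
			return NULL;	/* listings diverge before the crash */
	}
	return n_after < n_before ? before[n_after] : NULL;
}

/* Placeholder records for the usage note; not taken from a real system. */
static const char *const sample_before[] = {
	"record A", "record B", "record C", "record D"
};
static const char *const sample_after[] = {
	"record A", "record B"
};
```

With the placeholder data, `next_expected(sample_before, 4, sample_after, 2)` points at "record C", i.e. the entry whose handling is the first suspect.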
Re: [PATCH] misc: Removal of final callers using fastcall
On Thu, 2007-12-20 at 18:30 -0800, Andrew Morton wrote:
> On Wed, 12 Dec 2007 15:38:26 -0800 Harvey Harrison <[EMAIL PROTECTED]> wrote:
>
> > Andrew, I'm not sure who is best to hit with these final dribs and
> > drabs removing fastcall. Once all of these have hit Linus' tree
> > I will send a final patch deleting the include/linux/linkage.h
> > definitions as well as any remaining occurrences.
>
> Yes, that's a good approach, thanks. Wait until the tree is fastcall-clean
> and then kill the definition(s).
>
> I think I skipped rather a lot of remove-fastcall patches because a)
> suitable maintainers were cc'ed and b) I was going through a
> suicidal-over-bug-reports phase.
>
> Please keep them coming - I've always disliked fastcall.

Once I see these have hit the main tree, I'll send a patch getting any more that have snuck in for the next rc. After that there should be few enough left that I can send you a small patch for the next rc with the definition removal as well.

I'll keep on top of these.

Harvey
Re: OOPS: 2.6.24-rc5-mm1 -- EIP is at r_show+0x2a/0x70 -- (triggered by "cat /proc/iomem" AFTER suspend-to-disk/resume)
Resending... Curse GMail's HTML messages! On Dec 21, 2007 12:58 AM, Miles Lane <[EMAIL PROTECTED]> wrote: > > On Dec 20, 2007 12:32 PM, Andrew Morton <[EMAIL PROTECTED]> wrote: > > > On Thu, 20 Dec 2007 08:38:03 -0500 Miles Lane <[EMAIL PROTECTED]> wrote: > > > > > On further investigation, "cat /proc/iomem" does not trigger the stack > > > trace until after a suspend-to-disk/resume cycle has occurred. > > > > I still can't reproduce this. > > > > Could you please try this? > > > > - cat /proc/iomem > > - suspend/resume > > - do > > > > while read i > > do > > echo $i > > sleep 1 > > done < /proc/iomem > > > > then, with luck, we'll be able to work out which /proc/iomem record > > immediately precedes the corrupted one. > > > > [EMAIL PROTECTED]:~$ cat > test.sh > > while read i > do > echo $i > sleep 1 > done < /proc/iomem > ^C > [EMAIL PROTECTED]:~$ sh test.sh > -0009f7ff : System RAM > 0009f800-0009 : reserved > 000a-000b : Video RAM area > 000c-000c7fff : Video ROM > 000f-000f : System ROM > 0010-7f68 : System RAM > 0010-0039e4b7 : Kernel code > 0039e4b8-004f0983 : Kernel data > 00553000-007ecdfb : Kernel bss > 7f69-7f698fff : ACPI Tables > 7f699000-7f6f : ACPI Non-volatile Storage > 7f70-7fff : reserved > 8800-8bff : PCI CardBus #05 > 8c00-8fff : PCI CardBus #05 > Segmentation fault > > How do I determine what comes next? > > Thanks, > Miles > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: After many hours all outbound connections get stuck in SYN_SENT
On Dec 20 2007 23:05, Ilpo Järvinen wrote:
>>
>> Given the fact that I've had this problem for so long, over a variety
>> of networking hardware vendors and colo-facilities, this really sounds
>> good to me. It will be challenging for me to justify a kernel core
>> dump, but a simple patch to dump the Sack data would be do-able.
>
>If your symptoms really are: SYNs leaving (if they show up in tcpdump, for
>sure they've left TCP code already) and SYN-ACK not showing up even in
>something as early as in tcpdump (for sure TCP side code didn't execute at
>that point yet), there's very little chance that Linux' TCP code has some
>bug in it, only things that do something in such scenario are the SYN
>generation and retransmitting SYNs (and those are trivially verifiable
>from tcpdump).
>

Take a machine, put two interfaces in it, and configure them as a bridge (br0 over eth0 and eth1, without any assigned IP addresses). Put it between the end node and the Cisco and run tcpdump there, which should give an unbiased view wrt. endnode/cisco.

Then perhaps also configure such a listening bridge on the other side of the Cisco, e.g. on the link to the internet, and watch that. Compare the two tcpdumps and see if SACK got trashed.
RE: [patch 05/24] Text Edit Lock - Architecture Independent Code
> > === > > --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-12-12 > > 18:10:32.0 -0500 > > +++ linux-2.6-lttng/kernel/kprobes.c2007-12-12 > > 18:10:34.0 -0500 > > @@ -644,7 +644,9 @@ valid_p: > > list_del_rcu(>list); > > kfree(old_p); > > } > > + mutex_lock(_mutex); > > arch_remove_kprobe(p); > > + mutex_unlock(_mutex); > > } else { > > mutex_lock(_mutex); > > if (p->break_handler) > > I think "mutex_lock" and "mutex_unlock" shoud be in architecture code. > In "__register_kprobe" funtion, its implement > "arch_prepare_kprobe" and > "arch_arm_kprobe" is also depended on arch. So the remove > implement is not > the same on the different architecture code. > > Maybe it doesn't need the mutex_lock in "arch_remove_kprobe" > on some embeded > system chips if linux can support the other embeded system > chips in future. Could we insert the "mutex_lock" and "mutex_unlock" into "free_insn_slot" instead of architecture code? modify as follows: void __kprobes free_insn_slot(kprobe_opcode_t * slot, int dirty) { struct kprobe_insn_page *kip; struct hlist_node *pos; + mutex_lock(_mutex); hlist_for_each_entry(kip, pos, _insn_pages, hlist) { if (kip->insns <= slot && slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) { int i = (slot - kip->insns) / MAX_INSN_SIZE; if (dirty) { kip->slot_used[i] = SLOT_DIRTY; kip->ngarbage++; } else { collect_one_slot(kip, i); } break; } } if (dirty && ++kprobe_garbage_slots > INSNS_PER_PAGE) collect_garbage_slots(); + mutex_unlock(_mutex); } > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of zhangxiliang > Sent: Friday, December 21, 2007 1:19 PM > To: 'Mathieu Desnoyers'; [EMAIL PROTECTED]; 'Ingo > Molnar'; linux-kernel@vger.kernel.org > Cc: 'Andi Kleen' > Subject: RE: [patch 05/24] Text Edit Lock - Architecture > Independent Code > > hello, >I have some questions for your patches. > > > Paravirt and alternatives are always done when SMP is > > inactive, so there is no > > need to use locks. 
> > > -#ifndef CONFIG_KPROBES > > -#ifdef CONFIG_HOTPLUG_CPU > > - /* It must still be possible to apply SMP alternatives. */ > > - if (num_possible_cpus() <= 1) > > -#endif > > - { > > - change_page_attr(virt_to_page(start), > > -size >> PAGE_SHIFT, PAGE_KERNEL_RX); > > - printk("Write protecting the kernel text: > > %luk\n", size >> 10); > > - } > > -#endif > > + change_page_attr(virt_to_page(start), > > + size >> PAGE_SHIFT, PAGE_KERNEL_RX); > > + printk(KERN_INFO "Write protecting the kernel text: %luk\n", > > + size >> 10); > > + > > Why "mark_rodata_ro" doesn't consider smp instance? Maybe it > will be appied in > future. > > > > === > > --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-12-12 > > 18:10:32.0 -0500 > > +++ linux-2.6-lttng/kernel/kprobes.c2007-12-12 > > 18:10:34.0 -0500 > > @@ -644,7 +644,9 @@ valid_p: > > list_del_rcu(>list); > > kfree(old_p); > > } > > + mutex_lock(_mutex); > > arch_remove_kprobe(p); > > + mutex_unlock(_mutex); > > } else { > > mutex_lock(_mutex); > > if (p->break_handler) > > I think "mutex_lock" and "mutex_unlock" shoud be in architecture code. > In "__register_kprobe" funtion, its implement > "arch_prepare_kprobe" and > "arch_arm_kprobe" is also depended on arch. So the remove > implement is not > the same on the different architecture code. > > Maybe it doesn't need the mutex_lock in "arch_remove_kprobe" > on some embeded > system chips if linux can support the other embeded system > chips in future. > > > > -Original Message- > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED] On Behalf Of > > Mathieu Desnoyers > > Sent: Friday, December 21, 2007 9:55 AM > > To: [EMAIL PROTECTED]; Ingo Molnar; > > linux-kernel@vger.kernel.org > > Cc: Mathieu Desnoyers; Andi Kleen > > Subject: [patch 05/24] Text Edit Lock - Architecture > Independent Code > > > > This is an architecture independant synchronization around > kernel text > > modifications through use of a global mutex. 
> > > > A mutex has been chosen so that kprobes, the main user of > > this, can sleep during > > memory allocation between the memory read of the instructions > > it must replace > > and the memory write of the
i2c block read on an SMBus
I am trying to do an i2c block read using a call like

	rc = i2c_smbus_xfer(g_i2c_adp, buf[0], 0x0, I2C_SMBUS_READ, 0x0, I2C_SMBUS_I2C_BLOCK_DATA, );

and the logs show me that this hits the else branch of the following condition in the i801_block_transaction() function in i2c-i801.c (kernel version 2.6.23.11):

	if (command == I2C_SMBUS_I2C_BLOCK_DATA) {
		if (read_write == I2C_SMBUS_WRITE) {
			/* set I2C_EN bit in configuration register */
			pci_read_config_byte(I801_dev, SMBHSTCFG, );
			pci_write_config_byte(I801_dev, SMBHSTCFG, hostc | SMBHSTCFG_I2C_EN);
		} else {
			dev_err(_dev->dev, "I2C_SMBUS_I2C_BLOCK_READ not DB!\n");
			return -1;
		}
	}

Some time ago, when I was doing a web search, I seem to have run into a patch which allows doing an i2c block read on SMBus. Is there a patch for this? (Output from my lspci: 00:1f.3 SMBus: Intel Corporation 6300ESB SMBus Controller (rev 02))

Looking at the documentation for the 6300ESB SMBus Controller, it seems that the only I2C read transaction supported is a block read. All the other read transactions are SMBus type. Why is the i2c block read not supported in the driver?

Thanks in advance for all the input. Please CC me on the replies as I am not subscribed to the list.

Thx,
Venkat
Re: iommu dma mapping alignment requirements
BTW, I need to know urgently what HW is broken by this.

Ben.
Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()
On Thu, 2007-12-20 at 21:11 -0800, Greg KH wrote:
> On Fri, Dec 21, 2007 at 03:47:28PM +1100, Benjamin Herrenschmidt wrote:
> > pci: Remove pci_enable_device_bars() fix for qla
> >
> > The previous patch missed one occurrence of pci_enable_device_bars()
> > in the qla2xxx driver. This fixes it.
>
> Should I just merge this with your 2/3 patch so everything is sane?

Sure.

Cheers,
Ben.
Re: [PATCH 2/3] Add GD-Rom support to the SEGA Dreamcast
On Thu, Dec 20, 2007 at 09:53:54PM +, Adrian McMenamin wrote:
> On 16/12/2007, Paul Mundt <[EMAIL PROTECTED]> wrote:
> > Also, __devinit/__devexit annotations?
> >
>
> Is there any difference between __init and __devinit?

Yes.
[PATCH] ecryptfs: check for existing key_tfm at mount time
Jeff Moyer pointed out that a mount; umount loop of ecryptfs, with the same cipher & other mount options, created a new ecryptfs_key_tfm_cache item each time, and the cache could grow quite large this way. Looking at this with mhalcrow, we saw that ecryptfs_parse_options() unconditionally called ecryptfs_add_new_key_tfm(), which is what was adding these items. Refactor ecryptfs_get_tfm_and_mutex_for_cipher_name() to create a new helper function, ecryptfs_tfm_exists(), which checks for the cipher on the cached key_tfm_list, and sets a pointer to it if it exists. This can then be called from ecryptfs_parse_options(), and new key_tfm's can be added only when a cached one is not found. Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]> --- Index: linux-2.6.24-rc3/fs/ecryptfs/crypto.c === --- linux-2.6.24-rc3.orig/fs/ecryptfs/crypto.c +++ linux-2.6.24-rc3/fs/ecryptfs/crypto.c @@ -1868,6 +1868,33 @@ out: return rc; } +/** + * ecryptfs_tfm_exists - Search for existing tfm for cipher_name. + * @cipher_name: the name of the cipher to search for + * @key_tfm: set to corresponding tfm if found + * + * Returns 1 if found, with key_tfm set + * Returns 0 if not found, key_tfm set to NULL + */ +int ecryptfs_tfm_exists(char *cipher_name, struct ecryptfs_key_tfm **key_tfm) +{ + struct ecryptfs_key_tfm *tmp_key_tfm; + + mutex_lock(_tfm_list_mutex); + list_for_each_entry(tmp_key_tfm, _tfm_list, key_tfm_list) { + if (strcmp(tmp_key_tfm->cipher_name, cipher_name) == 0) { + mutex_unlock(_tfm_list_mutex); + if (key_tfm) + (*key_tfm) = tmp_key_tfm; + return 1; + } + } + mutex_unlock(_tfm_list_mutex); + if (key_tfm) + (*key_tfm) = NULL; + return 0; +} + int ecryptfs_get_tfm_and_mutex_for_cipher_name(struct crypto_blkcipher **tfm, struct mutex **tfm_mutex, char *cipher_name) @@ -1877,22 +1904,15 @@ int ecryptfs_get_tfm_and_mutex_for_ciphe (*tfm) = NULL; (*tfm_mutex) = NULL; - mutex_lock(_tfm_list_mutex); - list_for_each_entry(key_tfm, _tfm_list, key_tfm_list) { - if 
(strcmp(key_tfm->cipher_name, cipher_name) == 0) { - (*tfm) = key_tfm->key_tfm; - (*tfm_mutex) = _tfm->key_tfm_mutex; - mutex_unlock(_tfm_list_mutex); + + if (!ecryptfs_tfm_exists(cipher_name, _tfm)) { + rc = ecryptfs_add_new_key_tfm(_tfm, cipher_name, 0); + if (rc) { + printk(KERN_ERR "Error adding new key_tfm to list; " + "rc = [%d]\n", rc); goto out; } } - mutex_unlock(_tfm_list_mutex); - rc = ecryptfs_add_new_key_tfm(_tfm, cipher_name, 0); - if (rc) { - printk(KERN_ERR "Error adding new key_tfm to list; rc = [%d]\n", - rc); - goto out; - } (*tfm) = key_tfm->key_tfm; (*tfm_mutex) = _tfm->key_tfm_mutex; out: Index: linux-2.6.24-rc3/fs/ecryptfs/ecryptfs_kernel.h === --- linux-2.6.24-rc3.orig/fs/ecryptfs/ecryptfs_kernel.h +++ linux-2.6.24-rc3/fs/ecryptfs/ecryptfs_kernel.h @@ -623,6 +623,7 @@ ecryptfs_add_new_key_tfm(struct ecryptfs size_t key_size); int ecryptfs_init_crypto(void); int ecryptfs_destroy_crypto(void); +int ecryptfs_tfm_exists(char *cipher_name, struct ecryptfs_key_tfm **key_tfm); int ecryptfs_get_tfm_and_mutex_for_cipher_name(struct crypto_blkcipher **tfm, struct mutex **tfm_mutex, char *cipher_name); Index: linux-2.6.24-rc3/fs/ecryptfs/main.c === --- linux-2.6.24-rc3.orig/fs/ecryptfs/main.c +++ linux-2.6.24-rc3/fs/ecryptfs/main.c @@ -410,9 +410,11 @@ static int ecryptfs_parse_options(struct if (!cipher_key_bytes_set) { mount_crypt_stat->global_default_cipher_key_size = 0; } - rc = ecryptfs_add_new_key_tfm( - NULL, mount_crypt_stat->global_default_cipher_name, - mount_crypt_stat->global_default_cipher_key_size); + if (!ecryptfs_tfm_exists(mount_crypt_stat->global_default_cipher_name, +NULL)) + rc = ecryptfs_add_new_key_tfm( + NULL, mount_crypt_stat->global_default_cipher_name, + mount_crypt_stat->global_default_cipher_key_size); if (rc) { printk(KERN_ERR "Error attempting to initialize cipher with " "name = [%s] and key
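The pattern the patch introduces is "look up the cached entry first, add only on a miss". The following is a minimal single-threaded userspace sketch of that idea, not the eCryptfs code itself: `struct key_tfm`, `tfm_exists()`, `get_or_add_tfm()` and `tfm_count()` are hypothetical stand-ins, and the mutex that guards the real list is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one cached per-cipher entry. */
struct key_tfm {
	char cipher_name[32];
	struct key_tfm *next;
};

/* Head of the cache. The kernel code protects its list with a mutex;
 * this sketch assumes a single thread and omits the locking. */
static struct key_tfm *key_tfm_list;

/* Returns 1 and sets *found if cipher_name is cached, else 0 and NULL. */
int tfm_exists(const char *cipher_name, struct key_tfm **found)
{
	struct key_tfm *t;

	for (t = key_tfm_list; t; t = t->next) {
		if (strcmp(t->cipher_name, cipher_name) == 0) {
			if (found)
				*found = t;
			return 1;
		}
	}
	if (found)
		*found = NULL;
	return 0;
}

/* Adds an entry only when no cached one exists; returns 0 on success. */
int get_or_add_tfm(const char *cipher_name, struct key_tfm **out)
{
	struct key_tfm *t;

	assert(cipher_name != NULL);
	if (tfm_exists(cipher_name, out))
		return 0;
	if (strlen(cipher_name) >= sizeof(t->cipher_name))
		return -1;
	t = calloc(1, sizeof(*t));
	if (!t)
		return -1;
	strcpy(t->cipher_name, cipher_name);
	t->next = key_tfm_list;
	key_tfm_list = t;
	if (out)
		*out = t;
	return 0;
}

/* Number of cached entries, to check that repeated mounts do not grow it. */
size_t tfm_count(void)
{
	size_t n = 0;
	struct key_tfm *t;

	for (t = key_tfm_list; t; t = t->next)
		n++;
	return n;
}
```

Calling `get_or_add_tfm()` repeatedly with the same cipher name, as a mount; umount loop would, leaves the cache at one entry for that cipher. In a multi-threaded setting the lookup and the add would need to happen under the same lock to avoid two callers both missing and both adding.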
RE: [patch 05/24] Text Edit Lock - Architecture Independent Code
hello, I have some questions for your patches. > Paravirt and alternatives are always done when SMP is > inactive, so there is no > need to use locks. > -#ifndef CONFIG_KPROBES > -#ifdef CONFIG_HOTPLUG_CPU > - /* It must still be possible to apply SMP alternatives. */ > - if (num_possible_cpus() <= 1) > -#endif > - { > - change_page_attr(virt_to_page(start), > - size >> PAGE_SHIFT, PAGE_KERNEL_RX); > - printk("Write protecting the kernel text: > %luk\n", size >> 10); > - } > -#endif > + change_page_attr(virt_to_page(start), > + size >> PAGE_SHIFT, PAGE_KERNEL_RX); > + printk(KERN_INFO "Write protecting the kernel text: %luk\n", > + size >> 10); > + Why "mark_rodata_ro" doesn't consider smp instance? Maybe it will be appied in future. > === > --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-12-12 > 18:10:32.0 -0500 > +++ linux-2.6-lttng/kernel/kprobes.c 2007-12-12 > 18:10:34.0 -0500 > @@ -644,7 +644,9 @@ valid_p: > list_del_rcu(>list); > kfree(old_p); > } > + mutex_lock(_mutex); > arch_remove_kprobe(p); > + mutex_unlock(_mutex); > } else { > mutex_lock(_mutex); > if (p->break_handler) I think "mutex_lock" and "mutex_unlock" shoud be in architecture code. In "__register_kprobe" funtion, its implement "arch_prepare_kprobe" and "arch_arm_kprobe" is also depended on arch. So the remove implement is not the same on the different architecture code. Maybe it doesn't need the mutex_lock in "arch_remove_kprobe" on some embeded system chips if linux can support the other embeded system chips in future. > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Mathieu Desnoyers > Sent: Friday, December 21, 2007 9:55 AM > To: [EMAIL PROTECTED]; Ingo Molnar; > linux-kernel@vger.kernel.org > Cc: Mathieu Desnoyers; Andi Kleen > Subject: [patch 05/24] Text Edit Lock - Architecture Independent Code > > This is an architecture independant synchronization around kernel text > modifications through use of a global mutex. 
> > A mutex has been chosen so that kprobes, the main user of > this, can sleep during > memory allocation between the memory read of the instructions > it must replace > and the memory write of the breakpoint. > > Other user of this interface: immediate values. > > Paravirt and alternatives are always done when SMP is > inactive, so there is no > need to use locks. > > Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> > CC: Andi Kleen <[EMAIL PROTECTED]> > --- > include/linux/memory.h |7 +++ > mm/memory.c| 34 ++ > 2 files changed, 41 insertions(+) > > Index: linux-2.6-lttng/include/linux/memory.h > === > --- linux-2.6-lttng.orig/include/linux/memory.h> > 2007-11-07 11:11:26.0 -0500 > +++ linux-2.6-lttng/include/linux/memory.h2007-11-07 > 11:13:48.0 -0500 > @@ -93,4 +93,11 @@ extern int memory_notify(unsigned long v > #define hotplug_memory_notifier(fn, pri) do { } while (0) > #endif > > +/* > + * Take and release the kernel text modification lock, used > for code patching. > + * Users of this lock can sleep. > + */ > +extern void kernel_text_lock(void); > +extern void kernel_text_unlock(void); > + > #endif /* _LINUX_MEMORY_H_ */ > Index: linux-2.6-lttng/mm/memory.c > === > --- linux-2.6-lttng.orig/mm/memory.c 2007-11-07 > 11:12:33.0 -0500 > +++ linux-2.6-lttng/mm/memory.c 2007-11-07 > 11:14:25.0 -0500 > @@ -50,6 +50,8 @@ > #include > #include > #include > +#include > +#include > > #include > #include > @@ -84,6 +86,12 @@ EXPORT_SYMBOL(high_memory); > > int randomize_va_space __read_mostly = 1; > > +/* > + * mutex protecting text section modification (dynamic code > patching). > + * some users need to sleep (allocating memory...) while > they hold this lock. 
> + */ > +static DEFINE_MUTEX(text_mutex); > + > static int __init disable_randmaps(char *s) > { > randomize_va_space = 0; > @@ -2748,3 +2756,29 @@ int access_process_vm(struct task_struct > > return buf - old_buf; > } > + > +/** > + * kernel_text_lock - Take the kernel text modification lock > + * > + * Insures mutual write exclusion of kernel and modules text > live text > + * modification. Should be used for code patching. > + * Users of this lock can sleep. > + */ > +void __kprobes kernel_text_lock(void) > +{ > + mutex_lock(_mutex); > +} > +EXPORT_SYMBOL_GPL(kernel_text_lock); > + > +/** > + * kernel_text_unlock - Release the kernel text modification lock > + * > + * Insures mutual write exclusion of kernel and
Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()
On Fri, Dec 21, 2007 at 03:47:28PM +1100, Benjamin Herrenschmidt wrote:
> pci: Remove pci_enable_device_bars() fix for qla
>
> The previous patch missed one occurrence of pci_enable_device_bars()
> in the qla2xxx driver. This fixes it.

Should I just merge this with your 2/3 patch so everything is sane?

thanks,

greg k-h
Re: Linux 2.6.24-rc6
On Thu, 20 Dec 2007, Linus Torvalds wrote: > > And here's the git patch to avoid this optimization when there is > context. Actually, the code to finding one '\n' is still needed to avoid the (pathological) case of getting a "\No newline", so scrap that one which was too aggressive, and use this (simpler) one instead. Not that it matters in real life, since nobody uses -U0, and "git blame" won't care. But let's get it right anyway ;) This whole function has had more bugs than it has lines. Linus --- xdiff-interface.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/xdiff-interface.c b/xdiff-interface.c index 9ee877c..711029e 100644 --- a/xdiff-interface.c +++ b/xdiff-interface.c @@ -115,15 +115,18 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx) char *bp = b->ptr + b->size; long smaller = (a->size < b->size) ? a->size : b->size; + if (ctx) + return; + while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) { trimmed += blk; ap -= blk; bp -= blk; } - while (recovered < trimmed && 0 <= ctx) + while (recovered < trimmed) if (ap[recovered++] == '\n') - ctx--; + break; a->size -= (trimmed - recovered); b->size -= (trimmed - recovered); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
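For readers who want to see the patched logic in isolation, here is a userspace sketch of the function as it would read with the diff above applied. It is an approximation under stated assumptions: `struct buf` is a hypothetical stand-in for `mmfile_t`, and the block size is passed as a parameter (the original uses a constant 1024) so that small inputs can exercise it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for xdiff's mmfile_t: a buffer and its length. */
struct buf {
	char *ptr;
	long size;
};

/*
 * Drop the common tail of a and b, but only when no context is wanted
 * (ctx == 0). Whole blocks of `blk` identical bytes are trimmed from the
 * end, then bytes are given back up to and including the first '\n' so
 * that both buffers still end on a line boundary.
 */
void trim_common_tail(struct buf *a, struct buf *b, long ctx, long blk)
{
	long trimmed = 0, recovered = 0;
	char *ap = a->ptr + a->size;
	char *bp = b->ptr + b->size;
	long smaller = (a->size < b->size) ? a->size : b->size;

	assert(blk > 0);
	if (ctx)
		return;	/* any requested context disables the optimization */

	while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) {
		trimmed += blk;
		ap -= blk;
		bp -= blk;
	}

	while (recovered < trimmed)
		if (ap[recovered++] == '\n')
			break;
	a->size -= (trimmed - recovered);
	b->size -= (trimmed - recovered);
}
```

With a block size of 4 and the inputs "A1\nsame\nsame\n" and "B22\nsame\nsame\n", two blocks are trimmed and three bytes are recovered, leaving "A1\nsame\n" and "B22\nsame\n". With any nonzero ctx both buffers are left untouched, which is the point of the fix.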
Re: After many hours all outbound connections get stuck in SYN_SENT
> I do have TCP Sequence # Randomization enabled on my router.

Huh? Do you mean a PIX blade in a Cisco switch-router chassis? It would be very useful if you could be less vague about the equipment in use.

> However,
> if this was causing an issue, wouldn't it always occur and cause
> connection issues, not just after 38 hours of correct operation?

That depends more on your customers' networking attributes than you are sharing or perhaps even know. Perhaps your customer base is very Window-skewed and you simply aren't seeing any Sack Permitted negotiations for the first 37.999 hours. Or perhaps you've had a network glitch, and all of your connections have done a Selective Ack, which the firewall has trashed, leaving all the connections in a wacko state, not just a few which you haven't noticed.

The actual failure mode needs a packet trace to determine, but you should be able to do this yourself (or ask your local network engineering staff).

If your firewall is trashing the Sack field, then it needs to be fixed. Time to raise a case with the Cisco TAC and ask them directly if your PIX version has bug CSCse14419. You can't expect Sack to work when it's being fed trash, so it is important to make sure that is not happening.

Cheers, Glen

#include #undef KERNEL_HACKER
Re: Linux 2.6.24-rc6
On Thu, Dec 20, 2007 at 08:40:54PM -0800, Linus Torvalds wrote:
> That was a rather long-winded explanation of what happened, mainly because
> it was all very unexpected to me, and I had personally mistakenly thought
> the git optimization was perfectly valid and actually had to go through
> the end result to see what was going on.
>
> Anyway, the diff on kernel.org should be all ok now, and mirrored out too.
>

Thanks again for being so quick to track this down, applies fine and is out for building in rawhide now.

cheers, Kyle
Re: iommu dma mapping alignment requirements
Benjamin Herrenschmidt wrote: Sounds good. Thanks! Note, that these smaller sub-host-page-sized mappings might pollute the address space causing full aligned host-page-size maps to become scarce... Maybe there's a clever way to keep those in their own segment of the address space? We already have a large vs. small split in the iommu virtual space to alleviate this (though it's not a hard constraint, we can still get into the "other" side if the default one is full). Try that patch and let me know: Seems to be working! :) Index: linux-work/arch/powerpc/kernel/iommu.c === --- linux-work.orig/arch/powerpc/kernel/iommu.c 2007-12-21 10:39:39.0 +1100 +++ linux-work/arch/powerpc/kernel/iommu.c 2007-12-21 10:46:18.0 +1100 @@ -278,6 +278,7 @@ int iommu_map_sg(struct iommu_table *tbl unsigned long flags; struct scatterlist *s, *outs, *segstart; int outcount, incount, i; + unsigned int align; unsigned long handle; BUG_ON(direction == DMA_NONE); @@ -309,7 +310,11 @@ int iommu_map_sg(struct iommu_table *tbl /* Allocate iommu entries for that segment */ vaddr = (unsigned long) sg_virt(s); npages = iommu_num_pages(vaddr, slen); - entry = iommu_range_alloc(tbl, npages, , mask >> IOMMU_PAGE_SHIFT, 0); + align = 0; + if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && (vaddr & ~PAGE_MASK) == 0) + align = PAGE_SHIFT - IOMMU_PAGE_SHIFT; + entry = iommu_range_alloc(tbl, npages, , + mask >> IOMMU_PAGE_SHIFT, align); DBG(" - vaddr: %lx, size: %lx\n", vaddr, slen); @@ -572,7 +577,7 @@ dma_addr_t iommu_map_single(struct iommu { dma_addr_t dma_handle = DMA_ERROR_CODE; unsigned long uaddr; - unsigned int npages; + unsigned int npages, align; BUG_ON(direction == DMA_NONE); @@ -580,8 +585,13 @@ dma_addr_t iommu_map_single(struct iommu npages = iommu_num_pages(uaddr, size); if (tbl) { + align = 0; + if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && + ((unsigned long)vaddr & ~PAGE_MASK) == 0) + align = PAGE_SHIFT - IOMMU_PAGE_SHIFT; + dma_handle = iommu_alloc(tbl, vaddr, npages, direction, -mask >> IOMMU_PAGE_SHIFT, 0); 
+mask >> IOMMU_PAGE_SHIFT, align); if (dma_handle == DMA_ERROR_CODE) { if (printk_ratelimit()) { printk(KERN_INFO "iommu_alloc failed, " -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
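The alignment rule added by the patch can be read on its own: request extra alignment only when IOMMU pages are smaller than host pages and the buffer starts on a host page boundary. A small userspace sketch of that rule follows; the function name `iommu_align_order` is hypothetical, and the shifts are passed as parameters instead of using the kernel's `PAGE_SHIFT` and `IOMMU_PAGE_SHIFT` constants.

```c
#include <assert.h>

/*
 * Mirror of the patch's computation: returns PAGE_SHIFT - IOMMU_PAGE_SHIFT
 * when the IOMMU page is smaller than the host page and vaddr is host-page
 * aligned, otherwise 0 (no extra alignment requested).
 */
unsigned int iommu_align_order(unsigned long vaddr,
			       unsigned int page_shift,
			       unsigned int iommu_page_shift)
{
	unsigned long page_offset_mask;

	assert(page_shift < 8 * sizeof(unsigned long));
	page_offset_mask = (1UL << page_shift) - 1;	/* same bits as ~PAGE_MASK */

	if (iommu_page_shift < page_shift && (vaddr & page_offset_mask) == 0)
		return page_shift - iommu_page_shift;
	return 0;
}
```

For example, with 64K host pages (shift 16) and 4K IOMMU pages (shift 12), a buffer at a 64K-aligned address yields 4, while a buffer that starts in the middle of a host page yields 0, so sub-page mappings do not consume aligned slots.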
Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()
pci: Remove pci_enable_device_bars() fix for qla The previous patch missed one occurence of pci_enable_device_bars() in the qla2xxx driver. This fixes it. Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- Index: linux-merge/drivers/scsi/qla2xxx/qla_def.h === --- linux-merge.orig/drivers/scsi/qla2xxx/qla_def.h 2007-12-21 15:45:41.0 +1100 +++ linux-merge/drivers/scsi/qla2xxx/qla_def.h 2007-12-21 15:46:12.0 +1100 @@ -2272,6 +2272,7 @@ typedef struct scsi_qla_host { spinlock_t hardware_lock cacheline_aligned; int bars; + int mem_only; device_reg_t __iomem *iobase; /* Base I/O address */ unsigned long pio_address; unsigned long pio_length; Index: linux-merge/drivers/scsi/qla2xxx/qla_os.c === --- linux-merge.orig/drivers/scsi/qla2xxx/qla_os.c 2007-12-21 15:46:10.0 +1100 +++ linux-merge/drivers/scsi/qla2xxx/qla_os.c 2007-12-21 15:46:12.0 +1100 @@ -1626,6 +1626,7 @@ qla2x00_probe_one(struct pci_dev *pdev, sprintf(ha->host_str, "%s_%ld", QLA2XXX_DRIVER_NAME, ha->host_no); ha->parent = NULL; ha->bars = bars; + ha->mem_only = mem_only; /* Set ISP-type information. */ qla2x00_set_isp_flags(ha); @@ -2905,8 +2906,14 @@ qla2xxx_pci_slot_reset(struct pci_dev *p { pci_ers_result_t ret = PCI_ERS_RESULT_DISCONNECT; scsi_qla_host_t *ha = pci_get_drvdata(pdev); + int rc; - if (pci_enable_device_bars(pdev, ha->bars)) { + if (ha->mem_only) + rc = pci_enable_device_mem(pdev); + else + rc = pci_enable_device(pdev); + + if (rc) { qla_printk(KERN_WARNING, ha, "Can't re-enable PCI device after reset.\n"); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.22-stable causes oomkiller to be invoked
> > It was just > > > > while echo ; do cat /sys/kernel/ ; done > > > > it's all in the email threads somewhere.. > > The patch that was posted in the thread that I mentioned earlier is here. > I ran the test for 15 minutes and things are still fine. > > > > quicklist: Set tlb->need_flush if pages are remaining in quicklist 0 > > This ensures that the quicklists are drained. Otherwise draining may only > occur when the processor reaches an idle state. > Hi Christoph, No, it does not stop the oom I am seeing here. Thanks, > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > > Index: linux-2.6/include/asm-generic/tlb.h > === > --- linux-2.6.orig/include/asm-generic/tlb.h 2007-12-13 14:45:38.0 > -0800 > +++ linux-2.6/include/asm-generic/tlb.h 2007-12-13 14:51:07.0 > -0800 > @@ -14,6 +14,7 @@ > #define _ASM_GENERIC__TLB_H > > #include > +#include > #include > #include > > @@ -85,6 +86,9 @@ tlb_flush_mmu(struct mmu_gather *tlb, un > static inline void > tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long > end) > { > +#ifdef CONFIG_QUICKLIST > + tlb->need_flush += &__get_cpu_var(quicklist)[0].nr_pages != 0; > +#endif > tlb_flush_mmu(tlb, start, end); > > /* keep the page table cache within bounds */ -- regards, Dhaval -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()
On Thu, 2007-12-20 at 20:28 -0800, Greg KH wrote:
> On Thu, Dec 20, 2007 at 03:28:10PM +1100, Benjamin Herrenschmidt wrote:
> > Now that all in-tree users are gone, this removes pci_enable_device_bars()
> > completely.
>
> Hm, looks like you missed drivers/scsi/qla2xxx/qla_os.c
>
> Quick, before akpm gets mad at you for breaking the build, send me a
> patch! :)

Argh... there were two users in that file and I fixed only one... Follow-up patch in a blink.

Cheers,
Ben.
Re: Linux 2.6.24-rc6
On Thu, 20 Dec 2007, Linus Torvalds wrote:
>
> It only happened for a few files that had lots of repeated lines - so that
> the diff could literally be done multiple different ways - and in fact,
> the file that caused the problems really had a bogus commit that
> duplicated *way* too much data, and caused lots of #define's to exist
> twice.

Here's the example of this kind of behaviour: in the 2.6.24-rc5 tree the file drivers/video/mbx/reg_bits.h has the #defines for

	/* DINTRS - Display Interrupt Status Register */
	/* DINTRE - Display Interrupt Enable Register */

duplicated twice due to commit ba282daa919f89c871780f344a71e5403a70b634 ("mbxfb: Improvements and new features") by Raphael Assenat mistakenly adding another copy of the same old set of defines that we already got added once before by commit fb137d5b7f2301f2717944322bba38039083c431 ("mbxfb: Add more registers bits access macros").

Now, that was a mistake - and one that probably happened because Raphael or more likely Andrew Morton used GNU patch with its insane defaults (which is to happily apply the same patch that adds things twice, because it doesn't really care if the context matches or not).

But what that kind of thing causes is that when you create a patch of the end result, it can show the now new duplicate lines two different (but equally valid) ways: it can show it as an addition of the _first_ set of lines, or it can show it as an addition of the _second_ set of lines. They are the same, after all.

Now, it doesn't really matter which way you choose to show it, although because of how "git diff" finds similarities, it tends to prefer to show the second set of identical lines as the "new" ones. Which is generally reasonable.

However, that interacted really badly with the new git logic that said that "if the two files end in the same sequence, just ignore the common tail of the file", because the latter copy of the identical lines would now show up as _part_ of that common tail, so the lines that the git diff machinery would normally like to show up as "new" did in fact end up being considered uninteresting, because they were part of an identical tail.

So now "git diff" would happily pick _earlier_ lines as the new ones, and it would still be a conceptually valid diff, but because we had trimmed the tail of the file, that conceptually valid diff no longer had the expected shared context at the end.

And while it's a bit embarrassing, I'm really rather happy that both GNU patch and "git apply" actually refused to apply the patch. It may have been "conceptually correct" (ie it did really contain all of the changes!) but because it lacked the expected context it really wasn't a good patch.

That was a rather long-winded explanation of what happened, mainly because it was all very unexpected to me, and I had personally mistakenly thought the git optimization was perfectly valid and actually had to go through the end result to see what was going on.

Anyway, the diff on kernel.org should be all ok now, and mirrored out too.

Linus
Re: [PATCH 0/5] sg_ring for scsi
On Fri, 21 Dec 2007 14:26:47 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Friday 21 December 2007 13:28:34 FUJITA Tomonori wrote:
> > I'm not sure about chaining the headers (as your sg_ring and
> > scsi_sgtable do) would simplify LLDs. Have you looked at ips or
> > qla1280?
>
> Not yet, am working my way through the drivers, but I don't expect it will be
> a simplification to the normal SCSI LLDs. Most of them are mere consumers of
> sgs...

Some SCSI drivers like ips access the sglist in a tricky way. I feel that they don't work well with the sg_ring interface. So if you convert scsi_lib.c to use sg_ring, please see how it works with the tricky drivers before that.

> I'm not a SCSI person: I'm patching SCSI because I have to to get my
> own sg-using code clean :)

I'm SCSI-biased. If you don't convert scsi to use sg_ring, I won't complain. :) Though it would be better to have only one mechanism to handle large sglists in the kernel.
Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()
On Thu, Dec 20, 2007 at 03:28:10PM +1100, Benjamin Herrenschmidt wrote:
> Now that all in-tree users are gone, this removes pci_enable_device_bars()
> completely.

Hm, looks like you missed drivers/scsi/qla2xxx/qla_os.c

Quick, before akpm gets mad at you for breaking the build, send me a patch! :)

thanks,

greg k-h
Re: Linux 2.6.24-rc6
On Thu, 20 Dec 2007, Linus Torvalds wrote:
>
> The tar-ball and the git archive itself is fine, but yes, the diff from
> 2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization
> that has caused way too much pain.

Very interesting breakage. The patch was actually "correct" in a (rather limited) technical sense, but the context at the end was missing because while the trim_common_tail() code made sure to keep enough common context to allow a valid diff to be generated, the diff machinery itself could decide that it could generate the diff differently than the "obvious" solution.

It only happened for a few files that had lots of repeated lines - so that the diff could literally be done multiple different ways - and in fact, the file that caused the problems really had a bogus commit that duplicated *way* too much data, and caused lots of #define's to exist twice.

But the sad fact appears that the git optimization (which is very important for "git blame", which needs no context), is only really valid for that one case where we really don't need any context.

I uploaded a fixed patch. And here's the git patch to avoid this optimization when there is context.

		Linus

---
 xdiff-interface.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 9ee877c..0b7e057 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -110,22 +110,22 @@ int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
 {
 	const int blk = 1024;
-	long trimmed = 0, recovered = 0;
+	long trimmed = 0;
 	char *ap = a->ptr + a->size;
 	char *bp = b->ptr + b->size;
 	long smaller = (a->size < b->size) ? a->size : b->size;
 
+	if (ctx)
+		return;
+
 	while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) {
 		trimmed += blk;
 		ap -= blk;
 		bp -= blk;
 	}
 
-	while (recovered < trimmed && 0 <= ctx)
-		if (ap[recovered++] == '\n')
-			ctx--;
-	a->size -= (trimmed - recovered);
-	b->size -= (trimmed - recovered);
+	a->size -= trimmed;
+	b->size -= trimmed;
 }
 
 int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t const *xecfg, xdemitcb_t *xecb)
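For readers following the thread, the post-patch behaviour can be tried outside of git with a minimal userspace sketch. It assumes a hypothetical `struct buf` in place of git's `mmfile_t` and takes the block size as a parameter (git hard-codes 1024) so that short strings can exercise it; the function name is also made up for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for git's mmfile_t: a buffer and its length. */
struct buf {
	const char *ptr;
	long size;
};

/*
 * Sketch of the patched logic: drop whole blocks that are identical at
 * the end of both buffers, but only when no context lines are wanted.
 * "blk" is a parameter here purely so small inputs can be tested.
 */
static void trim_common_tail_sketch(struct buf *a, struct buf *b,
				    long ctx, int blk)
{
	long trimmed = 0;
	const char *ap = a->ptr + a->size;
	const char *bp = b->ptr + b->size;
	long smaller = (a->size < b->size) ? a->size : b->size;

	/* Any request for context disables the optimization entirely. */
	if (ctx)
		return;

	while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) {
		trimmed += blk;
		ap -= blk;
		bp -= blk;
	}

	a->size -= trimmed;
	b->size -= trimmed;
}
```

With a block size of 4, the buffers "xxxxABCDEFGH" and "yyABCDEFGH" lose their shared 8-byte tail when ctx is 0 and are left untouched when ctx is nonzero, which is the case "git diff" needs in order to keep trailing context.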
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Loic Prylli wrote:
> Just curious, do you know of any system where that recommendation was not
> followed? On all motherboards where I have seen an AMD-8131 or an AMD-8132,
> they were alone on their hypertransport link, and other "northbridges" (more
> precisely hypertransport to pci-express or pci-whatever, often nvidia) with
> a "MMCONFIG BAR" were on one of the other available hypertransport links in
> the system.
>
> Loic

Here is the PCI configuration of the HP DL585G2. You can see two nVidia CK804 PCIE root ports at bus 0 and bus 0x40. Each of them has an 8132 connected as a subordinate bridge.

[EMAIL PROTECTED] ~]# lspci -vt
-+-[:40]-+-00.0  nVidia Corporation CK804 Memory Controller
 |       +-01.0  nVidia Corporation CK804 Memory Controller
 |       +-0b.0-[:4f-51]--
 |       +-0c.0-[:4c-4e]--
 |       +-0d.0-[:49-4b]--
 |       +-0e.0-[:46-48]--
 |       +-10.0-[:41]--+-01.0  Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet
 |       |             \-02.0  Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet
 |       +-10.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 |       +-11.0-[:42-45]--
 |       \-11.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 \-[:00]-+-00.0  nVidia Corporation CK804 Memory Controller
         +-01.0  nVidia Corporation CK804 ISA Bridge
         +-02.0  nVidia Corporation CK804 USB Controller
         +-02.1  nVidia Corporation CK804 USB Controller
         +-06.0  nVidia Corporation CK804 IDE
         +-09.0-[:01]--+-03.0  ATI Technologies Inc ES1000
         |             +-04.0  Compaq Computer Corporation Integrated Lights Out Controller
         |             +-04.2  Compaq Computer Corporation Integrated Lights Out Processor
         |             +-04.4  Hewlett-Packard Company Proliant iLO2 virtual USB controller
         |             \-04.6  Hewlett-Packard Company Proliant iLO2 virtual UART
         +-0c.0-[:08-0a]----00.0  Hewlett-Packard Company Smart Array Controller
         +-0d.0-[:05-07]--
         +-0e.0-[:02-04]--
         +-18.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-18.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-18.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         +-18.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
         +-19.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-19.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-19.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         +-19.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
         +-1a.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-1a.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-1a.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         +-1a.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
         +-1b.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-1b.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-1b.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         \-1b.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
[EMAIL PROTECTED] ~]#
Re: Linux 2.6.24-rc6
On Thu, 20 Dec 2007, Kyle McMartin wrote:
>
> I think I see the problem, it's lack of context in the diff,

No, the problem is that "git diff" is apparently broken by a recent optimization. The diff is simply broken.

The tar-ball and the git archive itself is fine, but yes, the diff from 2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization that has caused way too much pain.

Sorry about that, I'll fix it up asap.

		Linus
Re: Linux 2.6.24-rc6
On Thu, Dec 20, 2007 at 07:49:05PM -0800, Linus Torvalds wrote:
>
> On Thu, 20 Dec 2007, Kyle McMartin wrote:
> >
> > I think I see the problem, it's lack of context in the diff,
>
> No, the problem is that "git diff" is apparently broken by a recent
> optimization. The diff is simply broken.
>
> The tar-ball and the git archive itself is fine, but yes, the diff from
> 2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization
> that has caused way too much pain.
>
> Sorry about that, I'll fix it up asap.
>

no biggie, thanks!

cheers, Kyle
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
On 12/20/2007 9:15 PM, Robert Hancock wrote:
>> Suggested Workaround
>>
>> It is strongly recommended that system designers do not connect the
>> AMD-8132 and devices that use extended
>> configuration space MMIO BARs (ex: HyperTransport-to-PCI Express®
>> bridges) to the same processor
>> HyperTransport link.
>>
>> Fix Planned
>> No
>
> That does sound fairly definitive. I have to wonder why certain system
> designers then didn't follow their strong recommendation..

Just curious, do you know of any system where that recommendation was not followed? On all motherboards where I have seen an AMD-8131 or an AMD-8132, they were alone on their hypertransport link, and other "northbridges" (more precisely hypertransport to pci-express or pci-whatever, often nvidia) with a "MMCONFIG BAR" were on one of the other available hypertransport links in the system.

Loic
Re: [patch 14/24] Immediate Values - x86 Optimization
Mathieu Desnoyers wrote:
> Argh.. Rusty asked to have a simplified version first, and then to
> implement the "more complex" one on top of it. However, in order to get
> the reentrancy I need for the markers, I need the complex version of the
> immediate values. Therefore, you find, in this patchset, the simple
> version first, and then, the more complex one implemented on top.
>
> About this patch header, the initial idea was to use the "Q" and "R"
> constraints, but, as stated just below, the "q" and "r" constraints are
> used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes
> immediate values are never used. So the complete header follows the
> source code, it's just that this paragraph could be clearer.

Then you have it backwards. "Q" and "R" avoid REX prefixes, "q" and "r" DO NOT.

	-hpa
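To illustrate the distinction being discussed, here is a minimal sketch in GCC extended inline assembly. It assumes an x86 target and GCC's documented x86 machine constraints, where "Q" restricts the operand to the a, b, c and d registers, while "q" in 64-bit mode may pick any integer register, including ones that need a REX prefix. The helper `low_byte()` is hypothetical and not taken from the Immediate Values patch set; on other architectures the sketch falls back to plain C.

```c
/*
 * Copy the low byte of "v" through a register-to-register byte move.
 * Both operands use the "Q" constraint, so the compiler may only pick
 * a, b, c or d, whose byte forms never require a REX prefix.
 */
static unsigned char low_byte(unsigned int v)
{
	unsigned char out;

#if defined(__i386__) || defined(__x86_64__)
	/* %b1 prints the byte-sized name of the register holding v. */
	__asm__("movb %b1, %0" : "=Q"(out) : "Q"(v));
#else
	/* Non-x86 fallback so the sketch still builds elsewhere. */
	out = (unsigned char)v;
#endif
	return out;
}
```

Swapping "Q" for "q" would still assemble, but on x86-64 the compiler would then be free to choose a register such as %r8b or %sil, whose encoding carries a REX prefix.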
Re: [PATCH 0/5] sg_ring for scsi
On Friday 21 December 2007 13:28:34 FUJITA Tomonori wrote:
> I'm not sure about chaining the headers (as your sg_ring and
> scsi_sgtable do) would simplify LLDs. Have you looked at ips or
> qla1280?

Not yet, am working my way through the drivers, but I don't expect it will be a simplification to the normal SCSI LLDs. Most of them are mere consumers of sgs...

I'm not a SCSI person: I'm patching SCSI because I have to to get my own sg-using code clean :)

Hope that clarifies,
Rusty.
[PATCH] mbx: Fix up duplicate defines in reg_bits.h
Otherwise patch gets horribly confused and falls over applying the diff. Not sure why these were being defined twice. Signed-off-by: Kyle McMartin <[EMAIL PROTECTED]> --- Well, we can get it fixed for -git1, I respun the patch-2.6.24-rc6 diff with git diff -p v2.6.23..HEAD and applied it to a pristine linux-2.6.23 tree without issue. cheers, Kyle drivers/video/mbx/reg_bits.h | 24 1 files changed, 0 insertions(+), 24 deletions(-) diff --git a/drivers/video/mbx/reg_bits.h b/drivers/video/mbx/reg_bits.h index 5f14b4b..8dc4283 100644 --- a/drivers/video/mbx/reg_bits.h +++ b/drivers/video/mbx/reg_bits.h @@ -540,30 +540,6 @@ #define DINTRE_HBLNK1_EN (1 << 1) #define DINTRE_HBLNK0_EN (1 << 0) -/* DINTRS - Display Interrupt Status Register */ -#define DINTRS_CUR_OR_S(1 << 18) -#define DINTRS_STR2_OR_S (1 << 17) -#define DINTRS_STR1_OR_S (1 << 16) -#define DINTRS_CUR_UR_S(1 << 6) -#define DINTRS_STR2_UR_S (1 << 5) -#define DINTRS_STR1_UR_S (1 << 4) -#define DINTRS_VEVENT1_S (1 << 3) -#define DINTRS_VEVENT0_S (1 << 2) -#define DINTRS_HBLNK1_S(1 << 1) -#define DINTRS_HBLNK0_S(1 << 0) - -/* DINTRE - Display Interrupt Enable Register */ -#define DINTRE_CUR_OR_EN (1 << 18) -#define DINTRE_STR2_OR_EN (1 << 17) -#define DINTRE_STR1_OR_EN (1 << 16) -#define DINTRE_CUR_UR_EN (1 << 6) -#define DINTRE_STR2_UR_EN (1 << 5) -#define DINTRE_STR1_UR_EN (1 << 4) -#define DINTRE_VEVENT1_EN (1 << 3) -#define DINTRE_VEVENT0_EN (1 << 2) -#define DINTRE_HBLNK1_EN (1 << 1) -#define DINTRE_HBLNK0_EN (1 << 0) - /* DLSTS - display load status register */ #define DLSTS_RLD_ADONE(1 << 23) -- 1.5.3.6 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 14/24] Immediate Values - x86 Optimization
* H. Peter Anvin ([EMAIL PROTECTED]) wrote:
> This patch is modified by another patch in the sequence. This feels
> needlessly confusing when reviewing (especially since the comment doesn't
> look to match the code, e.g. w.r.t to "Q" and "R" constraints); can you
> reorder the patchset to avoid that?
>

Argh.. Rusty asked to have a simplified version first, and then to implement the "more complex" one on top of it. However, in order to get the reentrancy I need for the markers, I need the complex version of the immediate values. Therefore, you find, in this patchset, the simple version first, and then, the more complex one implemented on top.

About this patch header, the initial idea was to use the "Q" and "R" constraints, but, as stated just below, the "q" and "r" constraints are used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes immediate values are never used. So the complete header follows the source code, it's just that this paragraph could be clearer.

Mathieu

> -hpa

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags
From: Matt Mackall <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 19:06:55 -0600

> @@ -707,7 +707,10 @@ static ssize_t kpagecount_read(struct fi
>  		return -EIO;
>
>  	while (count > 0) {
> -		ppage = pfn_to_page(pfn++);
> +		ppage = 0;
> +		if (pfn_valid(pfn))
> +			ppage = pfn_to_page(pfn);
> +		pfn++;
>  		if (!ppage)
>  			pcount = 0;
>  		else

Yes that should work, please use "NULL" in the final version of the patch instead of "0" so that sparse is happy.

Thanks.
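The pattern requested in the review (validate the frame number first, and spell the "no page" case as NULL) can be sketched in userspace as follows. Everything here is a stand-in: `pfn_valid_stub()`, `pfn_to_page_stub()`, the four-entry `mem_map` and the hole at pfn 2 are assumptions made for illustration, not the kernel's implementation.

```c
#include <stddef.h>

/* Toy page descriptor; the kernel's struct page is far larger. */
struct page {
	int count;
};

#define NR_PAGES 4
static struct page mem_map[NR_PAGES];

/* Hypothetical stand-in for pfn_valid(): pfn 2 models a memory hole. */
static int pfn_valid_stub(unsigned long pfn)
{
	return pfn < NR_PAGES && pfn != 2;
}

/* Hypothetical stand-in for pfn_to_page(); only safe on valid pfns. */
static struct page *pfn_to_page_stub(unsigned long pfn)
{
	return &mem_map[pfn];
}

/* Report a count of 0 for holes instead of dereferencing a bad page. */
static int page_count_at(unsigned long pfn)
{
	struct page *ppage = NULL;	/* NULL, not 0, keeps sparse quiet */

	if (pfn_valid_stub(pfn))
		ppage = pfn_to_page_stub(pfn);

	return ppage ? ppage->count : 0;
}
```

The conversion to a pointer happens only after the validity check, so a hole in the frame-number space yields a zero count rather than an access to a nonexistent descriptor.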
Re: Trailing periods in kernel messages
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 18:15:32 -0800

> No-period is a kernel idiom, produces perfectly readable output, I have
> never ever heard of anyone expressing the least concern over a lack of dots
> at the end of their printks and 91% of kernel code agrees.

I have never heard of a compiler expressing the least concern over whitespace and other aspects of coding style.
Re: [PATCH 0/10] sysfs network namespace support
On Sat, Dec 01, 2007 at 02:06:58AM -0700, Eric W. Biederman wrote:
>
> Now that we have network namespace support merged it is time to
> revisit the sysfs support so we can remove the dependency on !SYSFS.

Oops, I forgot to apply this to my tree. Eric, you still want this submitted, right?

thanks,

greg k-h
Re: [patch 14/24] Immediate Values - x86 Optimization
This patch is modified by another patch in the sequence. This feels needlessly confusing when reviewing (especially since the comment doesn't look to match the code, e.g. w.r.t to "Q" and "R" constraints); can you reorder the patchset to avoid that?

	-hpa
patch pci-remove-users-of-pci_enable_device_bars.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled Subject: PCI: Remove users of pci_enable_device_bars() to my gregkh-2.6 tree. Its filename is pci-remove-users-of-pci_enable_device_bars.patch This tree can be found at http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/ >From [EMAIL PROTECTED] Wed Dec 19 20:30:44 2007 From: Benjamin Herrenschmidt <[EMAIL PROTECTED]> Date: Thu, 20 Dec 2007 15:28:09 +1100 Subject: PCI: Remove users of pci_enable_device_bars() To: Greg Kroah-Hartman <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED], , <[EMAIL PROTECTED]>, Alan Cox <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> Message-ID: <[EMAIL PROTECTED]> This patch converts users of pci_enable_device_bars() to the new pci_enable_device_{io,mem} interface. The new API fits nicely, except maybe for the QLA case where a bit of code re-organization might be a good idea but I prefer sticking to the simple patch as I don't have hardware to test on. I'll also need some feedback on the cs5520 change. 
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- drivers/ata/pata_cs5520.c |2 +- drivers/i2c/busses/scx200_acb.c |2 +- drivers/ide/pci/cs5520.c| 10 -- drivers/ide/setup-pci.c |6 -- drivers/scsi/lpfc/lpfc_init.c |3 +-- drivers/scsi/qla2xxx/qla_os.c | 12 +--- 6 files changed, 24 insertions(+), 11 deletions(-) --- a/drivers/ata/pata_cs5520.c +++ b/drivers/ata/pata_cs5520.c @@ -229,7 +229,7 @@ static int __devinit cs5520_init_one(str return -ENOMEM; /* Perform set up for DMA */ - if (pci_enable_device_bars(pdev, 1<<2)) { + if (pci_enable_device_io(pdev)) { printk(KERN_ERR DRV_NAME ": unable to configure BAR2.\n"); return -ENODEV; } --- a/drivers/i2c/busses/scx200_acb.c +++ b/drivers/i2c/busses/scx200_acb.c @@ -492,7 +492,7 @@ static __init int scx200_create_pci(cons iface->pdev = pdev; iface->bar = bar; - rc = pci_enable_device_bars(iface->pdev, 1 << iface->bar); + rc = pci_enable_device_io(iface->pdev); if (rc) goto errout_free; --- a/drivers/ide/pci/cs5520.c +++ b/drivers/ide/pci/cs5520.c @@ -160,8 +160,14 @@ static int __devinit cs5520_init_one(str ide_setup_pci_noise(dev, d); /* We must not grab the entire device, it has 'ISA' space in its - BARS too and we will freak out other bits of the kernel */ - if (pci_enable_device_bars(dev, 1<<2)) { +* BARS too and we will freak out other bits of the kernel +* +* pci_enable_device_bars() is going away. I replaced it with +* IO only enable for now but I'll need confirmation this is +* allright for that device. If not, it will need some kind of +* quirk. --BenH. +*/ + if (pci_enable_device_io(dev)) { printk(KERN_WARNING "%s: Unable to enable 55x0.\n", d->name); return -ENODEV; } --- a/drivers/ide/setup-pci.c +++ b/drivers/ide/setup-pci.c @@ -236,7 +236,9 @@ EXPORT_SYMBOL_GPL(ide_setup_pci_noise); * @d: IDE port info * * Enable the IDE PCI device. We attempt to enable the device in full - * but if that fails then we only need BAR4 so we will enable that. 
+ * but if that fails then we only need IO space. The PCI code should + * have setup the proper resources for us already for controllers in + * legacy mode. * * Returns zero on success or an error code */ @@ -246,7 +248,7 @@ static int ide_pci_enable(struct pci_dev int ret; if (pci_enable_device(dev)) { - ret = pci_enable_device_bars(dev, 1 << 4); + ret = pci_enable_device_io(dev); if (ret < 0) { printk(KERN_WARNING "%s: (ide_setup_pci_device:) " "Could not enable device.\n", d->name); --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -2100,10 +2100,9 @@ static pci_ers_result_t lpfc_io_slot_res struct Scsi_Host *shost = pci_get_drvdata(pdev); struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba; struct lpfc_sli *psli = >sli; - int bars = pci_select_bars(pdev, IORESOURCE_MEM); dev_printk(KERN_INFO, >dev, "recovering from a slot reset.\n"); - if (pci_enable_device_bars(pdev, bars)) { + if (pci_enable_device_mem(pdev)) { printk(KERN_ERR "lpfc: Cannot re-enable " "PCI device after reset.\n"); return PCI_ERS_RESULT_DISCONNECT; --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1583,7 +1583,7 @@ qla2x00_probe_one(struct pci_dev *pdev, char pci_info[30]; char fw_str[30];
Re: Linux 2.6.24-rc6
On Thu, Dec 20, 2007 at 09:48:05PM -0500, Kyle McMartin wrote:
> 1 out of 3 hunks FAILED -- saving rejects to file
> drivers/video/mbx/reg_bits.h.rej
> error: Bad exit status from /var/tmp/rpm-tmp.22316 (%prep)
>

I think I see the problem, it's lack of context in the diff,

commit ba282daa919f89c871780f344a71e5403a70b634
Author: Raphael Assenat <[EMAIL PROTECTED]>
Date:   Tue Oct 16 01:28:40 2007 -0700

seems to duplicate the DINTRS & DINTRE defines for no obvious reason, confusing the hell out of patch.

regards, Kyle
patch pci-remove-pci_enable_device_bars.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled Subject: PCI: Remove pci_enable_device_bars() to my gregkh-2.6 tree. Its filename is pci-remove-pci_enable_device_bars.patch This tree can be found at http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/ >From [EMAIL PROTECTED] Wed Dec 19 20:30:57 2007 From: Benjamin Herrenschmidt <[EMAIL PROTECTED]> Date: Thu, 20 Dec 2007 15:28:10 +1100 Subject: PCI: Remove pci_enable_device_bars() To: Greg Kroah-Hartman <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED], , <[EMAIL PROTECTED]>, Alan Cox <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> Message-ID: <[EMAIL PROTECTED]> Now that all in-tree users are gone, this removes pci_enable_device_bars() completely. Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- drivers/pci/pci.c | 24 include/linux/pci.h |1 - 2 files changed, 25 deletions(-) --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -741,29 +741,6 @@ int pci_reenable_device(struct pci_dev * return 0; } -/** - * pci_enable_device_bars - Initialize some of a device for use - * @dev: PCI device to be initialized - * @bars: bitmask of BAR's that must be configured - * - * Initialize device before it's used by a driver. Ask low-level code - * to enable selected I/O and memory resources. Wake up the device if it - * was suspended. Beware, this function can fail. 
- */ -int -pci_enable_device_bars(struct pci_dev *dev, int bars) -{ - int err; - - if (atomic_add_return(1, >enable_cnt) > 1) - return 0; /* already enabled */ - - err = do_pci_enable_device(dev, bars); - if (err < 0) - atomic_dec(>enable_cnt); - return err; -} - static int __pci_enable_device_flags(struct pci_dev *dev, resource_size_t flags) { @@ -1695,7 +1672,6 @@ early_param("pci", pci_setup); device_initcall(pci_init); EXPORT_SYMBOL(pci_reenable_device); -EXPORT_SYMBOL(pci_enable_device_bars); EXPORT_SYMBOL(pci_enable_device_io); EXPORT_SYMBOL(pci_enable_device_mem); EXPORT_SYMBOL(pci_enable_device); --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -543,7 +543,6 @@ static inline int pci_write_config_dword } int __must_check pci_enable_device(struct pci_dev *dev); -int __must_check pci_enable_device_bars(struct pci_dev *dev, int mask); int __must_check pci_enable_device_io(struct pci_dev *dev); int __must_check pci_enable_device_mem(struct pci_dev *dev); int __must_check pci_reenable_device(struct pci_dev *); Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are bad/battery-class-driver.patch driver/adb-convert-from-class_device-to-device.patch driver/kobject-convert-hvc_console-to-use-kref-not-kobject.patch driver/kobject-convert-hvcs-to-use-kref-not-kobject.patch driver/kobject-convert-icom-to-use-kref-not-kobject.patch pci/pci-fix-bus-resource-assignment-on-32-bits-with-64b-resources.patch pci/pci-fix-warning-in-setup-res.c-on-32-bit-platforms-with-64-bit-resources.patch pci/pci-add-pci_enable_device_-io-mem-intefaces.patch pci/pci-remove-pci_enable_device_bars.patch pci/pci-remove-users-of-pci_enable_device_bars.patch usb/usb-remove-ohci-useless-masking-unmasking-of-wdh-interrupt.patch -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
patch pci-add-pci_enable_device_-io-mem-intefaces.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled

     Subject: PCI: Add pci_enable_device_{io,mem} intefaces

to my gregkh-2.6 tree. Its filename is

     pci-add-pci_enable_device_-io-mem-intefaces.patch

This tree can be found at
    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/

>From [EMAIL PROTECTED] Wed Dec 19 20:30:44 2007
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 15:28:08 +1100
Subject: PCI: Add pci_enable_device_{io,mem} intefaces
To: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], , <[EMAIL PROTECTED]>, Alan Cox <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>

The pci_enable_device_bars() interface isn't well suited to PCI because you can't actually enable/disable BARs individually on a device.

So for example, if a device has 2 memory BARs 0 and 1, and one of them (let's say 1) has not been successfully allocated by the firmware or the kernel, then enabling memory decoding shouldn't be permitted for the entire device since it will decode whatever random address is still in that BAR 1.

So a device must be either fully enabled for IO, for Memory, or for both. Not on a per-BAR basis.

This provides two new functions, pci_enable_device_io() and pci_enable_device_mem() to replace pci_enable_device_bars(). The implementation internally builds a BAR mask in order to be able to use existing arch infrastructure.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Acked-by: Ivan Kokshaysky <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/pci.c   |   49 -
 include/linux/pci.h |    2 ++
 2 files changed, 50 insertions(+), 1 deletion(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -764,6 +764,51 @@ pci_enable_device_bars(struct pci_dev *d
 	return err;
 }
 
+static int __pci_enable_device_flags(struct pci_dev *dev,
+				     resource_size_t flags)
+{
+	int err;
+	int i, bars = 0;
+
+	if (atomic_add_return(1, &dev->enable_cnt) > 1)
+		return 0;		/* already enabled */
+
+	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
+		if (dev->resource[i].flags & flags)
+			bars |= (1 << i);
+
+	err = do_pci_enable_device(dev, bars);
+	if (err < 0)
+		atomic_dec(&dev->enable_cnt);
+	return err;
+}
+
+/**
+ * pci_enable_device_io - Initialize a device for use with IO space
+ * @dev: PCI device to be initialized
+ *
+ *  Initialize device before it's used by a driver. Ask low-level code
+ *  to enable I/O resources. Wake up the device if it was suspended.
+ *  Beware, this function can fail.
+ */
+int pci_enable_device_io(struct pci_dev *dev)
+{
+	return __pci_enable_device_flags(dev, IORESOURCE_IO);
+}
+
+/**
+ * pci_enable_device_mem - Initialize a device for use with Memory space
+ * @dev: PCI device to be initialized
+ *
+ *  Initialize device before it's used by a driver. Ask low-level code
+ *  to enable Memory resources. Wake up the device if it was suspended.
+ *  Beware, this function can fail.
+ */
+int pci_enable_device_mem(struct pci_dev *dev)
+{
+	return __pci_enable_device_flags(dev, IORESOURCE_MEM);
+}
+
 /**
  * pci_enable_device - Initialize device before it's used by a driver.
  * @dev: PCI device to be initialized
@@ -777,7 +822,7 @@ pci_enable_device_bars(struct pci_dev *d
  */
 int pci_enable_device(struct pci_dev *dev)
 {
-	return pci_enable_device_bars(dev, (1 << PCI_NUM_RESOURCES) - 1);
+	return __pci_enable_device_flags(dev, IORESOURCE_MEM | IORESOURCE_IO);
 }
 
 /*
@@ -1651,6 +1696,8 @@ device_initcall(pci_init);
 
 EXPORT_SYMBOL(pci_reenable_device);
 EXPORT_SYMBOL(pci_enable_device_bars);
+EXPORT_SYMBOL(pci_enable_device_io);
+EXPORT_SYMBOL(pci_enable_device_mem);
 EXPORT_SYMBOL(pci_enable_device);
 EXPORT_SYMBOL(pcim_enable_device);
 EXPORT_SYMBOL(pcim_pin_device);
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -544,6 +544,8 @@ static inline int pci_write_config_dword
 
 int __must_check pci_enable_device(struct pci_dev *dev);
 int __must_check pci_enable_device_bars(struct pci_dev *dev, int mask);
+int __must_check pci_enable_device_io(struct pci_dev *dev);
+int __must_check pci_enable_device_mem(struct pci_dev *dev);
 int __must_check pci_reenable_device(struct pci_dev *);
 int __must_check pcim_enable_device(struct pci_dev *pdev);
 void pcim_pin_device(struct pci_dev *pdev);

Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

bad/battery-class-driver.patch
driver/adb-convert-from-class_device-to-device.patch
driver/kobject-convert-hvc_console-to-use-kref-not-kobject.patch
driver/kobject-convert-hvcs-to-use-kref-not-kobject.patch
driver/kobject-convert-icom-to-use-kref-not-kobject.patch
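The mask-building step described in the changelog above can be exercised outside the kernel with a small sketch. It assumes a simplified `struct resource_stub` holding only a flags word and a six-entry resource table; `bar_mask_for()` is a hypothetical name, and only the flag values mirror the kernel's IORESOURCE_IO and IORESOURCE_MEM definitions.

```c
/* Flag values as defined in the kernel's linux/ioport.h. */
#define IORESOURCE_IO	0x00000100
#define IORESOURCE_MEM	0x00000200

/* Assumption: six entries, one per standard BAR, for illustration only. */
#define NR_RESOURCES_SKETCH 6

/* Simplified stand-in for struct resource: only the type flags. */
struct resource_stub {
	unsigned long flags;
};

/*
 * Build a bitmask with bit i set for every resource whose type matches
 * "flags", mirroring the loop in __pci_enable_device_flags().
 */
static int bar_mask_for(const struct resource_stub *res, unsigned long flags)
{
	int i, bars = 0;

	for (i = 0; i < NR_RESOURCES_SKETCH; i++)
		if (res[i].flags & flags)
			bars |= (1 << i);

	return bars;
}
```

For a device whose BARs 0 and 5 are I/O and whose BARs 1 and 3 are memory, asking for IORESOURCE_IO gives 0x21, asking for IORESOURCE_MEM gives 0x0a, and asking for both gives 0x2b. Callers pick a whole resource type and never an individual BAR.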
patch pci-correctly-initialize-a-structure-for-pcie_save_pcix_state.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled

     Subject: PCI: correctly initialize a structure for pcie_save_pcix_state()

to my gregkh-2.6 tree.  Its filename is

     pci-correctly-initialize-a-structure-for-pcie_save_pcix_state.patch

This tree can be found at
    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


From [EMAIL PROTECTED] Mon Dec 17 18:02:37 2007
From: Shaohua Li <[EMAIL PROTECTED]>
Date: Tue, 18 Dec 2007 09:56:56 +0800
Subject: PCI: correctly initialize a structure for pcie_save_pcix_state()
To: lkml
Cc: Andrew Morton <[EMAIL PROTECTED]>, Greg KH <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>

save_state->cap_nr should be correctly set, otherwise we can't find the
saved cap at resume.

Signed-off-by: Shaohua Li <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/pci.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -587,6 +587,7 @@ static int pci_save_pcie_state(struct pc
 	pci_read_config_word(dev, pos + PCI_EXP_LNKCTL, &cap[i++]);
 	pci_read_config_word(dev, pos + PCI_EXP_SLTCTL, &cap[i++]);
 	pci_read_config_word(dev, pos + PCI_EXP_RTCTL, &cap[i++]);
+	save_state->cap_nr = PCI_CAP_ID_EXP;
 	pci_add_saved_cap(dev, save_state);
 	return 0;
 }
@@ -630,6 +631,7 @@ static int pci_save_pcix_state(struct pc
 	cap = (u16 *)&save_state->data[0];
 
 	pci_read_config_word(dev, pos + PCI_X_CMD, &cap[i++]);
+	save_state->cap_nr = PCI_CAP_ID_PCIX;
 	pci_add_saved_cap(dev, save_state);
 	return 0;
 }


Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

driver/kobject-change-drivers-cpuidle-sysfs.c-to-use-kobject_init_and_add.patch
pci/pcie-port-driver-correctly-detect-native-pme-feature.patch
pci/pcie-utilize-pcie-transaction-pending-bit.patch
pci/pci-add-pci-quirk-function-for-some-chipsets.patch
pci/pci-avoid-save-the-same-type-of-cap-multiple-times.patch
pci/pci-correctly-initialize-a-structure-for-pcie_save_pcix_state.patch
pci/pci-fix-typo-in-pci_save_pcix_state.patch
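To illustrate why the missing cap_nr matters, here is a minimal userspace sketch, not the kernel code itself. It assumes a simplified saved-capability list that is searched by capability ID at resume, and that a freshly allocated record starts out zeroed. The names `struct saved_cap`, `save_cap()` and `find_saved_cap()` are hypothetical stand-ins for the kernel's helpers; only the two capability ID values (0x10 for PCI Express, 0x07 for PCI-X) are taken from the PCI capability ID list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CAP_ID_PCIX 0x07	/* PCI-X capability ID */
#define CAP_ID_EXP  0x10	/* PCI Express capability ID */

/* Hypothetical, simplified stand-in for a saved-capability record. */
struct saved_cap {
	struct saved_cap *next;
	unsigned char cap_nr;		/* key used to find the record at resume */
	unsigned short data[4];		/* saved config words */
};

/* Resume path: look a record up by its capability ID. */
static struct saved_cap *find_saved_cap(struct saved_cap *head,
					unsigned char cap_nr)
{
	for (; head; head = head->next)
		if (head->cap_nr == cap_nr)
			return head;
	return NULL;
}

/*
 * Suspend path: allocate a zeroed record and chain it in.
 * With set_cap_nr == 0 this mimics the bug: cap_nr stays 0,
 * so a later lookup by the real capability ID cannot match.
 */
static struct saved_cap *save_cap(struct saved_cap **head,
				  unsigned char cap_nr, int set_cap_nr)
{
	struct saved_cap *cap = calloc(1, sizeof(*cap));

	if (!cap)
		return NULL;
	if (set_cap_nr)
		cap->cap_nr = cap_nr;
	cap->next = *head;
	*head = cap;
	return cap;
}
```

In this sketch, a record saved without the key is still on the list but can never be found by ID, which is the failure mode the two added lines are meant to close.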
Re: [PATCH 2/5] dma_map_sg_ring() helper
On Friday 21 December 2007 11:40:00 David Miller wrote:
> From: Rusty Russell <[EMAIL PROTECTED]>
> Date: Fri, 21 Dec 2007 11:35:12 +1100
>
> > On Friday 21 December 2007 11:00:27 FUJITA Tomonori wrote:
> > > We need to pass the whole sg entries to the IOMMUs at a time.
> >
> > Hi Fujita,
> >
> > OK, it's certainly possible to have an arch override.  For which
> > architecture is this BTW?
>
> SPARC64, POWERPC, maybe IA-64 etc.
>
> Basically any platform that potentially does virtual
> remapping and thus linearization.

Fujita said "need" which confused me.  I already said it should be handed
down as an optimization; I was curious what I had broken :)

> I think it should always be provided, the new APIs give
> less information to the implementation and that's a step
> backwards.

Absolutely.  In fact, I think the sg_ring header would be made safer if it
had the "dma_num" in it as well: it's more explicit and less surprising to
the caller than mangling sg->num.

How are these two patches then?
===
Introduce sg_ring: a ring of scatterlist arrays.

This patch introduces 'struct sg_ring', a layer on top of scatterlist
arrays.  It meshes nicely with routines which expect a simple array of
'struct scatterlist' because it is easy to break down the ring into its
constituent arrays.

The sg_ring header also encodes the maximum number of entries, useful
for routines which populate an sg.  We need never hand around a number
of elements any more.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
---
 include/linux/sg_ring.h |   74
 1 files changed, 74 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/sgring.h

diff --git a/include/linux/sg_ring.h b/include/linux/sg_ring.h
new file mode 100644
--- /dev/null
+++ b/include/linux/sg_ring.h
@@ -0,0 +1,128 @@
+#ifndef _LINUX_SG_RING_H
+#define _LINUX_SG_RING_H
+#include
+
+/**
+ * struct sg_ring - a ring of scatterlists
+ * @list: the list_head chaining them together
+ * @num: the number of valid sg entries
+ * @dma_num: the number of valid sg entries after dma mapping
+ * @max: the maximum number of sg entries (size of the sg array).
+ * @sg: the array of scatterlist entries.
+ *
+ * This provides a convenient encapsulation of one or more scatter gather
+ * arrays.  dma_map_sg_ring() (and friends) set @dma_num: some architectures
+ * coalesce sg entries, so this will be < num.
+ */
+struct sg_ring
+{
+	struct list_head list;
+	unsigned int num, dma_num, max;
+	struct scatterlist sg[0];
+};
+
+/* This helper declares an sg ring on the stack or in a struct. */
+#define DECLARE_SG_RING(name, max)		\
+	struct {				\
+		struct sg_ring ring;		\
+		struct scatterlist sg[max];	\
+	} name
+
+/**
+ * sg_ring_init - initialize a scatterlist ring.
+ * @sg: the sg_ring.
+ * @max: the size of the trailing sg array.
+ *
+ * After initialization sg is alone in the ring.
+ */
+static inline void sg_ring_init(struct sg_ring *sg, unsigned int max)
+{
+#ifdef CONFIG_DEBUG_SG
+	unsigned int i;
+	for (i = 0; i < max; i++)
+		sg->sg[i].sg_magic = SG_MAGIC;
+	sg->num = 0x;
+	sg->dma_num = 0x;
+#endif
+	INIT_LIST_HEAD(&sg->list);
+	sg->max = max;
+	/* FIXME: This is to clear the page bits. */
+	sg_init_table(sg->sg, sg->max);
+}
+
+/**
+ * sg_ring_single - initialize a one-element scatterlist ring.
+ * @sg: the sg_ring.
+ * @buf: the pointer to the buffer.
+ * @buflen: the length of the buffer.
+ *
+ * Does sg_ring_init and also sets up first (and only) sg element.
+ */
+static inline void sg_ring_single(struct sg_ring *sg,
+				  const void *buf,
+				  unsigned int buflen)
+{
+	sg_ring_init(sg, 1);
+	sg->num = 1;
+	sg_init_one(&sg->sg[0], buf, buflen);
+}
+
+/**
+ * sg_ring_next - next array in a scatterlist ring.
+ * @sg: the sg_ring.
+ * @head: the sg_ring head.
+ *
+ * This will return NULL once @sg has looped back around to @head.
+ */
+static inline struct sg_ring *sg_ring_next(const struct sg_ring *sg,
+					   const struct sg_ring *head)
+{
+	sg = list_first_entry(&sg->list, struct sg_ring, list);
+	if (sg == head)
+		sg = NULL;
+	return (struct sg_ring *)sg;
+}
+
+/* Helper for writing for loops. */
+static inline struct sg_ring *sg_ring_iter(const struct sg_ring *head,
+					   const struct sg_ring *sg,
+					   unsigned int *i)
+{
+	(*i)++;
+	/* While loop lets us skip any zero-entry sg_ring arrays */
+	while (*i == sg->num) {
+		*i = 0;
+		sg = sg_ring_next(sg, head);
+		if (!sg)
+			break;
+	}
+	return (struct sg_ring *)sg;
+}
+
+/**
+ * sg_ring_for_each
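
The ring walk described above (follow the list until it loops back to the head, skipping arrays with no valid entries) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct sg_ring_sketch` is a hypothetical simplification that stores bare lengths instead of `struct scatterlist`, and `ring_init()`, `ring_append()`, `ring_next()` and `ring_total_len()` are made-up names, not the helpers from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical simplified ring node: lengths only, fixed capacity of 4. */
struct sg_ring_sketch {
	struct list_head list;
	unsigned int num;	/* number of valid entries */
	unsigned int max;	/* capacity of len[] */
	unsigned int len[4];
};

/* After initialization the node is alone in the ring. */
static void ring_init(struct sg_ring_sketch *sg)
{
	sg->list.next = sg->list.prev = &sg->list;
}

/* Chain node in just before head, i.e. at the tail of the ring. */
static void ring_append(struct sg_ring_sketch *head,
			struct sg_ring_sketch *node)
{
	node->list.prev = head->list.prev;
	node->list.next = &head->list;
	head->list.prev->next = &node->list;
	head->list.prev = &node->list;
}

/* Next array in the ring, or NULL once we have looped back to head. */
static struct sg_ring_sketch *ring_next(const struct sg_ring_sketch *sg,
					const struct sg_ring_sketch *head)
{
	struct sg_ring_sketch *next = (struct sg_ring_sketch *)
		((char *)sg->list.next - offsetof(struct sg_ring_sketch, list));

	return next == head ? NULL : next;
}

/* Sum every valid entry; arrays with num == 0 contribute nothing. */
static unsigned int ring_total_len(const struct sg_ring_sketch *head)
{
	const struct sg_ring_sketch *sg;
	unsigned int total = 0, i;

	for (sg = head; sg; sg = ring_next(sg, head))
		for (i = 0; i < sg->num; i++)
			total += sg->len[i];
	return total;
}
```

A caller only ever holds the head: the count of valid entries and the capacity travel with each array, so no separate element count has to be passed around.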
Re: Linux 2.6.24-rc6
On Thu, Dec 20, 2007 at 05:41:09PM -0800, Linus Torvalds wrote:
> The regression list keeps shrinking, so we're still on track for a full
> 2.6.24 release in early January. Assuming we don't all overeat during the
> holidays and nobody gets any work done. But we all know that the holidays
> are really the time when we get away from the boring "real work", and can
> spend 24/7 on kernel hacking instead, right?

The patch-2.6.24-rc6.bz2 doesn't seem to apply to a pristine linux-2.6.23
tree? I see this while updating Fedora:

+ '[' '!' -f /home/kyle/rpms/kernel/devel/patch-2.6.24-rc6.bz2 ']'
+ case "$patch" in
+ bunzip2
+ patch -p1 -F1 -s
1 out of 3 hunks FAILED -- saving rejects to file drivers/video/mbx/reg_bits.h.rej
error: Bad exit status from /var/tmp/rpm-tmp.22316 (%prep)

cheers, Kyle
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Robert Hancock wrote:
> I have to wonder why certain system designers then didn't follow their
> strong recommendation..

I don't think I want to go there. I used to be a hardware/firmware guy.

:D :D
Re: Linux 2.6.24-rc6
On Thu, 2007-12-20 at 17:41 -0800, Linus Torvalds wrote:
> The most noticeable part here (both to users and in the diffstat) should
> be the libata-acpi fixes by Tejun Heo, which should hopefully take care of
> all of the regressions that were caused by teaching SATA about doing the
> proper ACPI stuff at bootup/suspend/resume/shutdown.
>
> Other changes visible in the diffstat are a couple of new watchdog drivers
> and the removal of the old tipar driver, and some Korean translations of
> the kernel docs. And some V4L videobuf changes.
>
> Other than that, it's pretty much a lot of small fixes (maybe not
> one-liners, but we're talking "a few lines"). Networking, USB, scsi,
> wireless, infiniband, IDE... With some alpha, ia64 and x86 arch updates.
>
> The regression list keeps shrinking, so we're still on track for a full
> 2.6.24 release in early January. Assuming we don't all overeat during the
> holidays and nobody gets any work done. But we all know that the holidays
> are really the time when we get away from the boring "real work", and can
> spend 24/7 on kernel hacking instead, right?
>
> Here's to a merry christmas, doing the whole druidic festival around the
> tree thing,

When my automated testing system applied it to 2.6.23, the error below
stopped the testing:

*** Hunk #3 FAILED at 534.
1 out of 3 hunks FAILED -- saving rejects to file drivers/video/mbx/reg_bits.h.rej
patching file drivers/video/mbx/regs.h

-yanmin
Re: [PATCH] misc: Removal of final callers using fastcall
On Wed, 12 Dec 2007 15:38:26 -0800 Harvey Harrison <[EMAIL PROTECTED]> wrote:

> Andrew, I'm not sure who is best to hit with these final dribs and
> drabs removing fastcall.  Once all of these have hit Linus' tree
> I will send a final patch deleting the include/linux/linkage.h
> definitions as well as any remaining occurrences.

Yes, that's a good approach, thanks.  Wait until the tree is fastcall-clean
and then kill the definition(s).

I think I skipped rather a lot of remove-fastcall patches because a) suitable
maintainers were cc'ed and b) I was going through a suicidal-over-bug-reports
phase.

Please keep them coming - I've always disliked fastcall.
Re: [Jan Beulich] [PATCH] constify tables in kernel/sysctl_check.c
On Thu, Dec 20, 2007 at 04:14:05PM -0700, Eric W. Biederman wrote:

 > Remains the question whether it is intended that many, perhaps even
 > large, tables are compiled in without ever having a chance to get used,
 > i.e. whether there shouldn't #ifdef CONFIG_xxx get added.

 > -static struct trans_ctl_table trans_net_ax25_param_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

we lost the _param, which will cause a duplicate definition with ..

 > -static struct trans_ctl_table trans_net_ax25_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

cut-n-paste thinko ?

	Dave

-- 
http://www.codemonkey.org.uk
[PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
The sense buffer in scsi_cmnd can nowadays be DMA'ed into directly
by some low level drivers (that typically happens with USB mass
storage).

This is a problem on non cache coherent architectures such as
embedded PowerPCs where the sense buffer can share cache lines with
other structure members, which leads to various forms of corruption.

This uses the newly defined __dma_buffer annotation to enforce that
on such platforms, the sense_buffer is contained within its own
cache line. This has no effect on cache coherent architectures.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 include/scsi/scsi_cmnd.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-merge.orig/include/scsi/scsi_cmnd.h	2007-12-21 13:07:14.0 +1100
+++ linux-merge/include/scsi/scsi_cmnd.h	2007-12-21 13:07:29.0 +1100
@@ -88,7 +88,7 @@ struct scsi_cmnd {
 				   working on */
 
 #define SCSI_SENSE_BUFFERSIZE 	96
-	unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
+	unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer;
 				/* obtained by REQUEST SENSE when
 				 * CHECK CONDITION is received on original
 				 * command (auto-sense) */
[PATCH 1/2] DMA buffer alignment annotations
This patch, based on some earlier work by Roland Dreier, introduces
a pair of annotations that can be used to enforce alignment of
objects that can be DMA'ed into, and to enforce that a DMA'able
object within a structure isn't sharing a cache line with some
other object.

Such sharing of a data structure between DMA and non-DMA objects
isn't a recommended practice, but it does happen and in some cases
might even make sense, so we now have a way to make it work
properly.

The current patch only enables such alignment for some PowerPC
platforms that do not have coherent caches. Other platforms such
as ARM, MIPS, etc... can define ARCH_MIN_DMA_ALIGNMENT if they
want to benefit from this; I don't know them well enough to do
it myself.

The initial issue I'm fixing (in a second patch) by using these
is the SCSI sense buffer which is currently part of the scsi
command structure and can be DMA'ed to. On non-coherent platforms,
this causes various corruptions as this cache line is shared with
various other fields of the scsi_cmnd data structure.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 Documentation/DMA-mapping.txt |   32
 include/asm-generic/page.h    |   10 ++
 include/asm-powerpc/page.h    |    8
 3 files changed, 50 insertions(+)

--- linux-merge.orig/include/asm-generic/page.h	2007-07-27 13:44:45.0 +1000
+++ linux-merge/include/asm-generic/page.h	2007-12-21 13:07:28.0 +1100
@@ -20,6 +20,16 @@ static __inline__ __attribute_const__ in
 	return order;
 }
 
+#ifndef ARCH_MIN_DMA_ALIGNMENT
+#define __dma_aligned
+#define __dma_buffer
+#else
+#define __dma_aligned		__attribute__((aligned(ARCH_MIN_DMA_ALIGNMENT)))
+#define __dma_buffer		__dma_buffer_line(__LINE__)
+#define __dma_buffer_line(line)	__dma_aligned;\
+				char __dma_pad_##line[0] __dma_aligned
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 
Index: linux-merge/include/asm-powerpc/page.h
===
--- linux-merge.orig/include/asm-powerpc/page.h	2007-09-28 11:42:10.0 +1000
+++ linux-merge/include/asm-powerpc/page.h	2007-12-21 13:15:02.0 +1100
@@ -77,6 +77,14 @@
 #define VM_DATA_DEFAULT_FLAGS64	(VM_READ | VM_WRITE | \
 				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
+/*
+ * On non cache coherent platforms, we enforce cache aligned DMA
+ * buffers inside of structures
+ */
+#ifdef CONFIG_NOT_COHERENT_CACHE
+#define ARCH_MIN_DMA_ALIGNMENT	L1_CACHE_BYTES
+#endif
+
 #ifdef __powerpc64__
 #include
 #else
Index: linux-merge/Documentation/DMA-mapping.txt
===
--- linux-merge.orig/Documentation/DMA-mapping.txt	2007-12-21 13:17:14.0 +1100
+++ linux-merge/Documentation/DMA-mapping.txt	2007-12-21 13:20:00.0 +1100
@@ -75,6 +75,38 @@ What about block I/O and networking buff
 networking subsystems make sure that the buffers they use are valid
 for you to DMA from/to.
 
+Note that on non-cache-coherent architectures, having a DMA buffer
+that shares a cache line with other data can lead to memory
+corruption.
+
+The __dma_buffer macro exists to allow safe DMA buffers to be declared
+easily and portably as part of larger structures without causing bloat
+on cache-coherent architectures. To get this macro, architectures have
+to define ARCH_MIN_DMA_ALIGNMENT to the requested alignment value in
+their asm/page.h before including asm-generic/page.h
+
+Of course these structures must be contained in memory that can be
+used for DMA as described above.
+
+To use __dma_buffer, just declare a struct like:
+
+	struct mydevice {
+		int field1;
+		char buffer[BUFFER_SIZE] __dma_buffer;
+		int field2;
+	};
+
+If this is used in code like:
+
+	struct mydevice *dev;
+	dev = kmalloc(sizeof *dev, GFP_KERNEL);
+
+then dev->buffer will be safe for DMA on all architectures.  On a
+cache-coherent architecture the members of dev will be aligned exactly
+as they would have been without __dma_buffer; on a non-cache-coherent
+architecture buffer and field2 will be aligned so that buffer does not
+share a cache line with any other data.
+
 			DMA addressing limitations
 
 Does your device have any DMA addressing limitations?  For example, is
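
The layout effect described in the documentation hunk can be checked in userspace with a small sketch. It is not the kernel macro: the names `DMA_ALIGNED`, `DMA_BUFFER` and `buffer_is_isolated()` are hypothetical, a 64-byte cache line is assumed for `ARCH_MIN_DMA_ALIGNMENT`, and the zero-length pad array relies on a GNU C extension (gcc, clang). The sketch also pastes `__LINE__` through an extra macro level, since an argument that is a direct operand of `##` is not expanded before pasting.

```c
#include <assert.h>
#include <stddef.h>

#define ARCH_MIN_DMA_ALIGNMENT 64	/* assumed cache line size for this sketch */

#define DMA_ALIGNED __attribute__((aligned(ARCH_MIN_DMA_ALIGNMENT)))

/* Two-level paste so that __LINE__ is expanded before ## is applied. */
#define DMA_PAD_NAME2(line)	dma_pad_##line
#define DMA_PAD_NAME(line)	DMA_PAD_NAME2(line)

/*
 * Align the annotated member, then follow it with an aligned zero-length
 * pad so that the next member starts on a fresh cache line.
 */
#define DMA_BUFFER \
	DMA_ALIGNED; char DMA_PAD_NAME(__LINE__)[0] DMA_ALIGNED

#define BUFFER_SIZE 96

struct mydevice {
	int field1;
	char buffer[BUFFER_SIZE] DMA_BUFFER;
	int field2;
};

/* Returns 1 if no cache line holding part of buffer also holds field1 or field2. */
static int buffer_is_isolated(void)
{
	size_t line = ARCH_MIN_DMA_ALIGNMENT;
	size_t first = offsetof(struct mydevice, buffer) / line;
	size_t last = (offsetof(struct mydevice, buffer) + BUFFER_SIZE - 1) / line;

	return offsetof(struct mydevice, field1) / line < first &&
	       offsetof(struct mydevice, field2) / line > last;
}
```

The cost is visible in `sizeof(struct mydevice)`, which grows to a multiple of the assumed line size; that is the bloat the empty definitions avoid on cache-coherent architectures.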
Re: [PATCH 0/5] sg_ring for scsi
On Fri, 21 Dec 2007 10:13:38 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Thursday 20 December 2007 18:58:07 David Miller wrote:
> > From: Rusty Russell <[EMAIL PROTECTED]>
> > Date: Thu, 20 Dec 2007 18:53:48 +1100
> >
> > > Manipulating the magic chains is horrible; it looks simple to the
> > > places which simply want to iterate through it, but it's awful for
> > > code which wants to create them.
> >
> > I'm not saying complexity is inherent in this stuff, but
> > assuming that it is the complexity should live as far away
> > from the minions (the iterators in this case).  Therefore,
> > the creators is the right spot for the hard stuff.
>
> In this case, the main benefit of the sg chaining was that the conversion of
> most scsi drivers was easy (basically sg++ -> sg = sg_next(sg)).  The
> conversion to sg_ring is more complex, but the end result is not
> significantly more complex.
>
> However, the cost to code which manipulates sg chains was significant: I
> tried using them in virtio and it was too ugly to live (so that doesn't
> support sg chaining).  If this was the best we could do, that'd be fine.
>
> But, as demonstrated, there are real benefits of having an explicit header:

I'm not sure that chaining the headers (as your sg_ring and scsi_sgtable do)
would simplify LLDs. Have you looked at ips or qla1280?
Re: PCI resource problems caused by improper address rounding
> So in your case, it should *result* in the exact same situation that your > patch did, but at the same time, when dealing with the (more common) case > of smaller allocations, we still continue to try to avoid being too close > to the top-of-memory. > > So it's not perfect, but perhaps it is a good compromise between being > careful and having to make room? > > Does this work for your case? I'm not totally happy with changing the generic code like that, to possibly not enforce "min" anymore. Other archs may have very good reasons to provide a min value here... Though at the same time, at least on powerpc, the parent resource of the host bridge will be the real limit, so that may not be a big issue. Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/5] dma_map_sg_ring() helper
On Thu, 20 Dec 2007 16:40:00 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:

> From: Rusty Russell <[EMAIL PROTECTED]>
> Date: Fri, 21 Dec 2007 11:35:12 +1100
>
> > On Friday 21 December 2007 11:00:27 FUJITA Tomonori wrote:
> > > We need to pass the whole sg entries to the IOMMUs at a time.
> >
> > Hi Fujita,
> >
> > OK, it's certainly possible to have an arch override.  For which
> > architecture is this BTW?
>
> SPARC64, POWERPC, maybe IA-64 etc.

And x86_64, Alpha, and PARISC.

> Basically any platform that potentially does virtual
> remapping and thus linearization.
>
> I think it should always be provided, the new APIs give
> less information to the implementation and that's a step
> backwards.

Agreed.
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Tony Camuso wrote:
> Robert Hancock wrote:
>> First off, I would like to see confirmation from the horses' mouths
>> here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is
>> no other way to get around this problem than disabling MMCONFIG for
>> accesses behind those chips.
>
> I happen to have this one stored in my desktop.
>
> From AMD-8132™ HyperTransport™ PCI-X® 2.0 Tunnel Revision Guide
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf
>
> 79 AMD-8132™ Tunnel Lacks Extended Configuration Space
>    Memory-Mapped I/O Base Address Register
>
> Description
> Current AMD processors do not natively support PCI-defined extended
> configuration space. A memory mapped I/O base address register (MMIO BAR)
> is required in chipset devices to support extended configuration space.
> The AMD-8132 does not have this MMIO BAR.
>
> Potential Effect On System
> The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO BAR
> to support extended configuration space. Using a device which does have
> this MMIO BAR and an AMD-8132 on the same HyperTransport™ link of the
> processor may cause firmware/software problems.
>
> The base configuration space of the AMD-8132 and PCI(-X) devices attached
> to it are accessible using only the mechanism defined in PCI 2.3.
> Registers of PCI-X Mode 2 devices attached to the AMD-8132 in the extended
> configuration space are not accessible. The AMD-8132 has no registers in
> the extended configuration space.
>
> Suggested Workaround
> It is strongly recommended that system designers do not connect the
> AMD-8132 and devices that use extended configuration space MMIO BARs
> (ex: HyperTransport-to-PCI Express® bridges) to the same processor
> HyperTransport link.
>
> Fix Planned
> No

That does sound fairly definitive. I have to wonder why certain system
designers then didn't follow their strong recommendation..

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: Trailing periods in kernel messages
On Fri, 21 Dec 2007 02:43:33 +0100 Frans Pop <[EMAIL PROTECTED]> wrote:

> On Thursday 20 December 2007, Alan Cox wrote:
> > The kernel printk messages are sentences.
>
> I'm afraid that I completely and utterly disagree. Kernel messages are _not_
> sentences. The vast majority is not well-formed and does not contain any of
> the elements that are required for a proper sentence.
>
> The most kernel messages can be compared to is a rather diverse and sloppy
> enumeration. And enumerations follow completely different rules than
> sentences. It can better be characterized as a "semi-random sequence of
> context-sensitive technical messages".
>
> IMHO the existing rule that "Kernel messages do not have to be terminated
> with a period." is completely justified, though it does need some minor
> clarification on the cases in which proper punctuation _should_ be
> followed.

No-period is a kernel idiom, produces perfectly readable output, I have
never ever heard of anyone expressing the least concern over a lack of
dots at the end of their printks and 91% of kernel code agrees.

otoh the place where no-dots comes horridly unstuck is if a single printk
contains two sentences:

	printk("My computer caught on fire.  I hope yours does too\n");

that's really daft.  It's very rare though.

Of course one could always patch syslogd to add the dots, or change printk
and add an i_am_anal=1 kernel boot option.

Andy, please have an accident with that checkpatch change and let's hope
like hell that nobody starts trying to "fix" any of this.
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Robert Hancock wrote:
> The case of the device built into the K8 northbridge that's unreachable by
> MMCONFIG kind of makes sense, since the northbridge is what's translating
> the MMCONFIG memory access into config accesses. It seems bizarre to me
> that a bridge chip could possibly have such a problem. The MMCONFIG access
> should get translated into a configuration space access in the northbridge
> and from that point on there's no difference between an MMCONFIG and type1
> access.

Robert's point is well taken. Only northbridge chips can give us this kind
of trouble, and the only chips mentioned in the present discussion as not
being mmconf-compliant are northbridges (8132, ht1000).

The patch is aware of this, so once a root bus has been programmed for
legacy pci config access, all descendant buses automatically inherit this
access mechanism and are therefore not probed by the patch.
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
On 12/20/2007 6:21 PM, Tony Camuso wrote:
>
> And the MMCONFIG problem with enterprise systems and workstations, where
> we do control the BIOS (for the most part), is due to known bugs in
> certain versions of certain chipsets, HT1000, AMD8132, among them, not
> the BIOS.

The lack of MMCONFIG support is indeed because some hypertransport chipsets
lack that support. But there are some BIOSes out there that are advertising
support for all busses in their MCFG acpi attribute (even the busses managed
by some amd8131 in a mixed nvidia-ck804/amd8131 motherboard), and the BIOS
seems at least faulty for advertising a capability that does not exist.

Loic
Re: almost daily Kernel oops with 2.6.23.9 - and now 2.6.23.11 as well
Ok, so after the holidays I will do the following:

- let memtest86+ run for several hours.
- do a full backup, to switch to r3 and build an unpatched kernel.
- see if I can reproduce the oops with .21 and .22 (because AFAIR there was
  no oops with .21, but I might be wrong).

Not exactly in that order.

Glück Auf
Volker

ps: please cc me. I am not subscribed to lkml.
[patch 08/24] Text Edit Lock - kprobes x86_32
Make kprobes use INIT_ARRAY().

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Tested-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/kprobes_32.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/kprobes_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_32.c	2007-11-13 09:45:35.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_32.c	2007-11-13 09:45:44.0 -0500
@@ -176,12 +176,13 @@ int __kprobes arch_prepare_kprobe(struct
 
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, BREAKPOINT_INSTRUCTION, 1),
+		1);
 }
 
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, &p->opcode, 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, p->opcode, 1), 1);
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
Re: [Bug 9182] Critical memory leak (dirty pages)
On Friday 21 December 2007 06:24, Linus Torvalds wrote:
> On Thu, 20 Dec 2007, Jan Kara wrote:
> >   As I wrote in my previous email, this solution works but hides the
> > fact that the page really *has* dirty data in it and *is* pinned in
> > memory until the commit code gets to writing it. So in theory it could
> > disturb the writeout logic by having more dirty data in memory than vm
> > thinks it has. Not that I'd have a better fix now but I wanted to point
> > out this problem.
>
> Well, I worry more about the VM being sane - and by the time we actually
> hit this case, as far as VM sanity is concerned, the page no longer really
> exists. It's been removed from the page cache, and it only really exists
> as any other random kernel allocation.

It does allow the VM to just not worry about this. However I don't
really like these kinds of catch-all conditions that are hard to get
rid of and can encourage bad behaviour.

It would be nice if the "insane" things were made to clean up after
themselves.

> The fact that low-level filesystems (in this case ext3 journaling) do
> their own insane things is not something the VM even _should_ care about.
> It's just an internal FS allocation, and the FS can do whatever the hell
> it wants with it, including doing IO etc.
>
> The kernel doesn't consider any other random IO pages to be "dirty" either
> (eg if you do direct-IO writes using low-level SCSI commands, the VM
> doesn't consider that to be any special dirty stuff, it's just random page
> allocations again). This is really no different.
>
> In other words: the Linux "VM" subsystem is really two different parts: the
> low-level page allocator (which obviously knows that the page is still in
> *use*, since it hasn't been free'd), and the higher-level file mapping and
> caching stuff that knows about things like page "dirtyiness". And once
> you've done a "remove_from_page_cache()", the higher levels are no longer
> involved, and dirty accounting simply doesn't get into the picture.

That's all true... it would simply be nice to ask the filesystems to do
this. But anyway I think your patch is pretty reasonable for the moment.
[patch 14/24] Immediate Values - x86 Optimization
x86 optimization of the immediate values which uses a movl with code patching
to set/unset the value used to populate the register used as variable source.

Changelog:
- Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing
  non atomic writes to a code region only touched by us (nobody can execute it
  since we are protected by the imv_mutex).
- Put imv_set and _imv_set in the architecture independent header.
- Use $0 instead of %2 with (0) operand.
- Add x86_64 support, ready for i386+x86_64 -> x86 merge.
- Use asm-x86/asm.h.

Ok, so the most flexible solution that I see, that should fit for both i386 and
x86_64 would be :
1 byte     : "=Q" : Any register accessible as rh: a, b, c, and d.
2, 4 bytes : "=R" : Legacy register: the eight integer registers available
                    on all i386 processors (a, b, c, d, si, di, bp, sp).
8 bytes    : (only for x86_64)
             "=r" : A register operand is allowed provided that it is in a
                    general register.
That should make sure x86_64 won't try to use REX prefixed opcodes for
1, 2 and 4 bytes values.

- Create the instruction in a discarded section to calculate its size. This is
  how we can align the beginning of the instruction on an address that will
  permit atomic modification of the immediate value without knowing the size of
  the opcode used by the compiler.
- Bugfix : 8 bytes 64 bits immediate value was declared as "4 bytes" in the
  immediate structure.
- Change the immediate.c update code to support variable length opcodes.
- Vastly simplified, using a busy looping IPI with interrupts disabled.
  Does not protect against NMI nor MCE.
- Pack the __imv section. Use smallest types required for size (char).
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: "H. Peter Anvin" <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig            |    1
 include/asm-x86/immediate.h |   77
 2 files changed, 78 insertions(+)

Index: linux-2.6-lttng/include/asm-x86/immediate.h
===
--- /dev/null	1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/include/asm-x86/immediate.h	2007-11-21 11:04:33.0 -0500
@@ -0,0 +1,77 @@
+#ifndef _ASM_X86_IMMEDIATE_H
+#define _ASM_X86_IMMEDIATE_H
+
+/*
+ * Immediate values. x86 architecture optimizations.
+ *
+ * (C) Copyright 2006 Mathieu Desnoyers <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#include
+
+/**
+ * imv_read - read immediate variable
+ * @name: immediate value name
+ *
+ * Reads the value of @name.
+ * Optimized version of the immediate.
+ * Do not use in __init and __exit functions. Use _imv_read() instead.
+ * If size is bigger than the architecture long size, fall back on a memory
+ * read.
+ *
+ * Make sure to populate the initial static 64 bits opcode with a value
+ * that will generate an instruction with 8 bytes immediate value (not the REX.W
+ * prefixed one that loads a sign extended 32 bits immediate value in a r64
+ * register).
+ */
+#define imv_read(name)							\
+	({								\
+		__typeof__(name##__imv) value;				\
+		BUILD_BUG_ON(sizeof(value) > 8);			\
+		switch (sizeof(value)) {				\
+		case 1:							\
+			asm(".section __imv,\"a\",@progbits\n\t"	\
+				_ASM_PTR "%c1, (3f)-%c2\n\t"		\
+				".byte %c2\n\t"				\
+				".previous\n\t"				\
+				"mov $0,%0\n\t"				\
+				"3:\n\t"				\
+				: "=q" (value)				\
+				: "i" (&name##__imv),			\
+				  "i" (sizeof(value)));			\
+			break;						\
+		case 2:							\
+		case 4:							\
+			asm(".section __imv,\"a\",@progbits\n\t"	\
+				_ASM_PTR "%c1, (3f)-%c2\n\t"		\
+				".byte %c2\n\t"				\
+
[patch 10/24] Text Edit Lock - x86_32 standardize debug rodata
Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Andi Kleen <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] CC: Thomas Gleixner <[EMAIL PROTECTED]> CC: Ingo Molnar <[EMAIL PROTECTED]> CC: H. Peter Anvin <[EMAIL PROTECTED]> --- arch/x86/mm/init_32.c | 20 +++- 1 file changed, 7 insertions(+), 13 deletions(-) Index: linux-2.6-lttng/arch/x86/mm/init_32.c === --- linux-2.6-lttng.orig/arch/x86/mm/init_32.c 2007-11-13 09:25:29.0 -0500 +++ linux-2.6-lttng/arch/x86/mm/init_32.c 2007-11-13 09:45:48.0 -0500 @@ -784,28 +784,21 @@ static int noinline do_test_wp_bit(void) } #ifdef CONFIG_DEBUG_RODATA - void mark_rodata_ro(void) { unsigned long start = PFN_ALIGN(_text); unsigned long size = PFN_ALIGN(_etext) - start; -#ifndef CONFIG_KPROBES -#ifdef CONFIG_HOTPLUG_CPU - /* It must still be possible to apply SMP alternatives. */ - if (num_possible_cpus() <= 1) -#endif - { - change_page_attr(virt_to_page(start), -size >> PAGE_SHIFT, PAGE_KERNEL_RX); - printk("Write protecting the kernel text: %luk\n", size >> 10); - } -#endif + change_page_attr(virt_to_page(start), + size >> PAGE_SHIFT, PAGE_KERNEL_RX); + printk(KERN_INFO "Write protecting the kernel text: %luk\n", + size >> 10); + start += size; size = (unsigned long)__end_rodata - start; change_page_attr(virt_to_page(start), size >> PAGE_SHIFT, PAGE_KERNEL_RO); - printk("Write protecting the kernel read-only data: %luk\n", + printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n", size >> 10); /* @@ -816,6 +809,7 @@ void mark_rodata_ro(void) */ global_flush_tlb(); } + #endif void free_init_pages(char *what, unsigned long begin, unsigned long end) -- Mathieu Desnoyers Computer Engineering Ph.D. 
Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 02/24] Kprobes - do not use kprobes mutex in arch code
Remove the kprobes mutex from kprobes.h, since it does not belong there. Also remove all use of this mutex in the architecture specific code, replacing it by a proper mutex lock/unlock in the architecture agnostic code. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] --- arch/ia64/kernel/kprobes.c|2 -- arch/powerpc/kernel/kprobes.c |2 -- arch/s390/kernel/kprobes.c|2 -- arch/x86/kernel/kprobes_32.c |2 -- arch/x86/kernel/kprobes_64.c |2 -- include/linux/kprobes.h |2 -- kernel/kprobes.c |2 ++ 7 files changed, 2 insertions(+), 12 deletions(-) Index: linux-2.6-lttng/include/linux/kprobes.h === --- linux-2.6-lttng.orig/include/linux/kprobes.h2007-12-10 09:53:27.0 -0500 +++ linux-2.6-lttng/include/linux/kprobes.h 2007-12-12 18:10:34.0 -0500 @@ -35,7 +35,6 @@ #include #include #include -#include #ifdef CONFIG_KPROBES #include @@ -183,7 +182,6 @@ static inline void kretprobe_assert(stru } extern spinlock_t kretprobe_lock; -extern struct mutex kprobe_mutex; extern int arch_prepare_kprobe(struct kprobe *p); extern void arch_arm_kprobe(struct kprobe *p); extern void arch_disarm_kprobe(struct kprobe *p); Index: linux-2.6-lttng/arch/x86/kernel/kprobes_32.c === --- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_32.c 2007-12-10 09:53:27.0 -0500 +++ linux-2.6-lttng/arch/x86/kernel/kprobes_32.c2007-12-12 18:10:34.0 -0500 @@ -186,9 +186,7 @@ void __kprobes arch_disarm_kprobe(struct void __kprobes arch_remove_kprobe(struct kprobe *p) { - mutex_lock(_mutex); free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1)); - mutex_unlock(_mutex); } static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb) Index: linux-2.6-lttng/kernel/kprobes.c === --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-12-12 18:10:32.0 -0500 +++ linux-2.6-lttng/kernel/kprobes.c2007-12-12 18:10:34.0 -0500 @@ -644,7 +644,9 @@ valid_p: list_del_rcu(>list); kfree(old_p); } + 
mutex_lock(_mutex); arch_remove_kprobe(p); + mutex_unlock(_mutex); } else { mutex_lock(_mutex); if (p->break_handler) Index: linux-2.6-lttng/arch/ia64/kernel/kprobes.c === --- linux-2.6-lttng.orig/arch/ia64/kernel/kprobes.c 2007-12-12 18:06:06.0 -0500 +++ linux-2.6-lttng/arch/ia64/kernel/kprobes.c 2007-12-12 18:10:34.0 -0500 @@ -582,9 +582,7 @@ void __kprobes arch_disarm_kprobe(struct void __kprobes arch_remove_kprobe(struct kprobe *p) { - mutex_lock(_mutex); free_insn_slot(p->ainsn.insn, 0); - mutex_unlock(_mutex); } /* * We are resuming execution after a single step fault, so the pt_regs Index: linux-2.6-lttng/arch/powerpc/kernel/kprobes.c === --- linux-2.6-lttng.orig/arch/powerpc/kernel/kprobes.c 2007-12-10 09:53:27.0 -0500 +++ linux-2.6-lttng/arch/powerpc/kernel/kprobes.c 2007-12-12 18:10:34.0 -0500 @@ -88,9 +88,7 @@ void __kprobes arch_disarm_kprobe(struct void __kprobes arch_remove_kprobe(struct kprobe *p) { - mutex_lock(_mutex); free_insn_slot(p->ainsn.insn, 0); - mutex_unlock(_mutex); } static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs) Index: linux-2.6-lttng/arch/s390/kernel/kprobes.c === --- linux-2.6-lttng.orig/arch/s390/kernel/kprobes.c 2007-12-10 09:53:27.0 -0500 +++ linux-2.6-lttng/arch/s390/kernel/kprobes.c 2007-12-12 18:10:34.0 -0500 @@ -220,9 +220,7 @@ void __kprobes arch_disarm_kprobe(struct void __kprobes arch_remove_kprobe(struct kprobe *p) { - mutex_lock(_mutex); free_insn_slot(p->ainsn.insn, 0); - mutex_unlock(_mutex); } static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs) Index: linux-2.6-lttng/arch/x86/kernel/kprobes_64.c === --- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_64.c 2007-12-10 09:53:27.0 -0500 +++ linux-2.6-lttng/arch/x86/kernel/kprobes_64.c2007-12-12 18:10:34.0 -0500 @@ -225,9 +225,7 @@ void __kprobes arch_disarm_kprobe(struct void __kprobes arch_remove_kprobe(struct kprobe *p) { - mutex_lock(_mutex); free_insn_slot(p->ainsn.insn, 0); - mutex_unlock(_mutex); } static void
[patch 12/24] Immediate Values - Architecture Independent Code
Immediate values are used as read mostly variables that are rarely updated. They use code patching to modify the values inscribed in the instruction stream. It provides a way to save precious cache lines that would otherwise have to be used by these variables.
There is a generic _imv_read() version, which uses standard global variables, and optimized per architecture imv_read() implementations, which use a load immediate to remove a data cache hit. When the immediate values functionality is disabled in the kernel, it falls back to global variables.
It adds a new rodata section "__imv" to place the pointers to the enable value. Immediate values activation functions sit in kernel/immediate.c.
Immediate values refer to the memory address of a previously declared integer. This integer holds the information about the state of the immediate values associated, and must be accessed through the API found in linux/immediate.h.
At module load time, each immediate value is checked to see if it must be enabled. It would be the case if the variable they refer to is exported from another module and already enabled.
In the early stages of start_kernel(), the immediate values are updated to reflect the state of the variable they refer to.
* Why should this be merged *
It improves performance on heavy memory I/O workloads. An interesting result shows the potential this infrastructure has by showing the slowdown a simple system call such as getppid() suffers when it is used under heavy user-space cache thrashing:
Random walk L1 and L2 thrashing surrounding a getppid() call: (note: in this test, do_syscall_trace was taken at each system call, see Documentation/immediate.txt in these patches for details)
- No memory pressure : getppid() takes 1573 cycles
- With memory pressure : getppid() takes 15589 cycles
We therefore have a slowdown of 10 times just to get the kernel variables from memory. Another test on the same architecture (Intel P4) measured the memory latency to be 559 cycles. Therefore, each cache line removed from the hot path would improve the syscall time by 3.5% in these conditions.
Changelog:
- section __imv is already SHF_ALLOC
- Because of the wonders of ELF, section 0 has sh_addr and sh_size 0. So the if (immediateindex) is unnecessary here.
- Remove module_mutex usage: depend on functions implemented in module.c for that.
- Does not update tainted module's immediate values.
- remove imv_*_t types, add DECLARE_IMV() and DEFINE_IMV().
- imv_read() becomes imv_read(var) because of this.
- Adding a new EXPORT_IMV_SYMBOL(_GPL).
- remove imv_if(). Should use if (unlikely(imv_read(var))) instead.
- Wait until we have gcc support before we add the imv_if macro, since its form may have to change.
- Don't declare the __imv section in vmlinux.lds.h, just put the content in the rodata section.
- Simplify interface : remove imv_set_early, keep track of kernel boot status internally.
- Remove the ALIGN(8) before the __imv section. It is packed now.
- Uses an IPI busy-loop on each CPU with interrupts disabled as a simple, architecture agnostic, update mechanism.
- Use imv_* instead of immediate_*.
Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Rusty Russell <[EMAIL PROTECTED]>
--- include/asm-generic/vmlinux.lds.h |3 include/linux/immediate.h | 94 +++ include/linux/module.h| 16 +++ init/main.c |8 + kernel/Makefile |1 kernel/immediate.c| 187 ++ kernel/module.c | 50 +- 7 files changed, 358 insertions(+), 1 deletion(-) Index: linux-2.6-lttng/include/linux/immediate.h === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6-lttng/include/linux/immediate.h 2007-11-28 09:32:04.0 -0500 @@ -0,0 +1,94 @@ +#ifndef _LINUX_IMMEDIATE_H +#define _LINUX_IMMEDIATE_H + +/* + * Immediate values, can be updated at runtime and save cache lines. + * + * (C) Copyright 2007 Mathieu Desnoyers <[EMAIL PROTECTED]> + * + * This file is released under the GPLv2. + * See the file COPYING for more details.
+ */ + +#ifdef CONFIG_IMMEDIATE + +struct __imv { + unsigned long var; /* Pointer to the identifier variable of the +* immediate value +*/ + unsigned long imv; /* +* Pointer to the memory location of the +* immediate value within the instruction. +*/ + unsigned char size; /* Type size. */ +} __attribute__ ((packed)); + +#include + +/** + * imv_set - set immediate variable (with locking) + * @name: immediate value name + * @i: required value + * + * Sets the value of @name, taking the module_mutex if required by + * the architecture. + */ +#define
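To make the fallback path described in this patch a bit more concrete, here is a minimal user-space sketch of what the generic variant could look like when immediate values degrade to ordinary global variables. The macro names follow the changelog (DEFINE_IMV, _imv_read, imv_set) and the name##__imv suffix used by the arch headers, but the macro bodies are assumptions made for illustration, not the definitions from the patch. The syscall_tracing flag and the two helper functions are hypothetical.

```c
/* Sketch only: assumed fallback definitions, not the code from the patch. */
#define DEFINE_IMV(type, name)	type name##__imv	/* plain global storage */
#define _imv_read(name)		(name##__imv)		/* ordinary memory read */
#define imv_set(name, i)	(name##__imv = (i))	/* ordinary memory write */

/* Hypothetical flag that is read often and updated rarely. */
DEFINE_IMV(char, syscall_tracing);

/* Hot path: the flag is tested on every call. */
int trace_hits(int nr_calls)
{
	int hits = 0;
	int i;

	for (i = 0; i < nr_calls; i++)
		if (_imv_read(syscall_tracing))
			hits++;
	return hits;
}

/* Slow path: flip the flag. */
void set_tracing(char on)
{
	imv_set(syscall_tracing, on);
}
```

In this fallback form every read is a data memory access. The optimized per-architecture imv_read() is meant to keep the same calling shape while replacing that access with a load immediate patched into the instruction stream.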
[patch 07/24] Text Edit Lock - kprobes architecture independent support
Use the mutual exclusion provided by the text edit lock in the kprobes code. It allows coherent manipulation of the kernel code by other subsystems. Changelog: Move the kernel_text_lock/unlock out of the for loops. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: Roel Kluin <[EMAIL PROTECTED]> --- kernel/kprobes.c | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) Index: linux-2.6-lttng/kernel/kprobes.c === --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-11-16 13:40:06.0 -0500 +++ linux-2.6-lttng/kernel/kprobes.c2007-11-17 10:00:23.0 -0500 @@ -43,6 +43,7 @@ #include #include #include +#include #include #include @@ -568,9 +569,10 @@ static int __kprobes __register_kprobe(s goto out; } + kernel_text_lock(); ret = arch_prepare_kprobe(p); if (ret) - goto out; + goto out_unlock_text; INIT_HLIST_NODE(>hlist); hlist_add_head_rcu(>hlist, @@ -578,7 +580,8 @@ static int __kprobes __register_kprobe(s if (kprobe_enabled) arch_arm_kprobe(p); - +out_unlock_text: + kernel_text_unlock(); out: mutex_unlock(_mutex); @@ -621,8 +624,11 @@ valid_p: * enabled - otherwise, the breakpoint would already have * been removed. We save on flushing icache. */ - if (kprobe_enabled) + if (kprobe_enabled) { + kernel_text_lock(); arch_disarm_kprobe(p); + kernel_text_unlock(); + } hlist_del_rcu(_p->hlist); cleanup_p = 1; } else { @@ -644,9 +650,7 @@ valid_p: list_del_rcu(>list); kfree(old_p); } - mutex_lock(_mutex); arch_remove_kprobe(p); - mutex_unlock(_mutex); } else { mutex_lock(_mutex); if (p->break_handler) @@ -717,7 +721,6 @@ static int __kprobes pre_handler_kretpro ri->rp = rp; ri->task = current; arch_prepare_kretprobe(ri, regs); - /* XXX(hch): why is there no hlist_move_head? 
*/ hlist_del(>uflist); hlist_add_head(>uflist, >rp->used_instances); @@ -938,11 +941,13 @@ static void __kprobes enable_all_kprobes if (kprobe_enabled) goto already_enabled; + kernel_text_lock(); for (i = 0; i < KPROBE_TABLE_SIZE; i++) { head = _table[i]; hlist_for_each_entry_rcu(p, node, head, hlist) arch_arm_kprobe(p); } + kernel_text_unlock(); kprobe_enabled = true; printk(KERN_INFO "Kprobes globally enabled\n"); @@ -967,6 +972,7 @@ static void __kprobes disable_all_kprobe kprobe_enabled = false; printk(KERN_INFO "Kprobes globally disabled\n"); + kernel_text_lock(); for (i = 0; i < KPROBE_TABLE_SIZE; i++) { head = _table[i]; hlist_for_each_entry_rcu(p, node, head, hlist) { @@ -974,6 +980,7 @@ static void __kprobes disable_all_kprobe arch_disarm_kprobe(p); } } + kernel_text_unlock(); mutex_unlock(_mutex); /* Allow all currently running kprobes to complete */ -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
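The register path in this patch nests the text edit lock inside the kprobes mutex and adds a second exit label so that a failure after the text lock is taken releases both locks in reverse order. A minimal user-space sketch of that pattern, assuming pthread mutexes as stand-ins for the two kernel locks, might look like the following. The names register_probe, prepare_probe and armed_probe_count are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <pthread.h>

/* User-space stand-ins (assumptions): one lock for the probe table,
 * one for text modification. */
static pthread_mutex_t probe_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t text_mutex = PTHREAD_MUTEX_INITIALIZER;
static int armed_probes;

/* Hypothetical preparation step; rejects negative addresses. */
static int prepare_probe(long addr)
{
	return addr < 0 ? -EINVAL : 0;
}

int register_probe(long addr)
{
	int ret = 0;

	pthread_mutex_lock(&probe_mutex);
	if (addr == 0) {
		ret = -EINVAL;
		goto out;		/* text lock not taken yet */
	}

	pthread_mutex_lock(&text_mutex);
	ret = prepare_probe(addr);
	if (ret)
		goto out_unlock_text;	/* drop the text lock first */
	armed_probes++;			/* "arm" while the text is locked */

out_unlock_text:
	pthread_mutex_unlock(&text_mutex);
out:
	pthread_mutex_unlock(&probe_mutex);
	return ret;
}

int armed_probe_count(void)
{
	return armed_probes;
}
```

Stacking the labels keeps a single unlock sequence for the success path and both failure paths, which is the same reason the patch changes the early "goto out" into "goto out_unlock_text".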
[patch 16/24] Immediate Values - Powerpc Optimization
PowerPC optimization of the immediate values which uses a li instruction, patched with an immediate value. Changelog: - Put imv_set and _imv_set in the architecture independent header. - Pack the __imv section. Use smallest types required for size (char). - Remove architecture specific update code : now handled by architecture agnostic code. - Use imv_* instead of immediate_*. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Rusty Russell <[EMAIL PROTECTED]> CC: Christoph Hellwig <[EMAIL PROTECTED]> CC: Paul Mackerras <[EMAIL PROTECTED]> --- arch/powerpc/Kconfig|1 include/asm-powerpc/immediate.h | 55 2 files changed, 56 insertions(+) Index: linux-2.6-lttng/include/asm-powerpc/immediate.h === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6-lttng/include/asm-powerpc/immediate.h 2007-11-19 12:26:16.0 -0500 @@ -0,0 +1,55 @@ +#ifndef _ASM_POWERPC_IMMEDIATE_H +#define _ASM_POWERPC_IMMEDIATE_H + +/* + * Immediate values. PowerPC architecture optimizations. + * + * (C) Copyright 2006 Mathieu Desnoyers <[EMAIL PROTECTED]> + * + * This file is released under the GPLv2. + * See the file COPYING for more details. + */ + +#include + +/** + * imv_read - read immediate variable + * @name: immediate value name + * + * Reads the value of @name. + * Optimized version of the immediate. + * Do not use in __init and __exit functions. Use _imv_read() instead. 
+ */ +#define imv_read(name) \ + ({ \ + __typeof__(name##__imv) value; \ + BUILD_BUG_ON(sizeof(value) > 8);\ + switch (sizeof(value)) {\ + case 1: \ + asm(".section __imv,\"a\",@progbits\n\t"\ + PPC_LONG "%c1, ((1f)-1)\n\t"\ + ".byte 1\n\t" \ + ".previous\n\t" \ + "li %0,0\n\t" \ + "1:\n\t"\ + : "=r" (value) \ + : "i" (##__imv)); \ + break; \ + case 2: \ + asm(".section __imv,\"a\",@progbits\n\t"\ + PPC_LONG "%c1, ((1f)-2)\n\t"\ + ".byte 2\n\t" \ + ".previous\n\t" \ + "li %0,0\n\t" \ + "1:\n\t"\ + : "=r" (value) \ + : "i" (##__imv)); \ + break; \ + case 4: \ + case 8: value = name##__imv;\ + break; \ + }; \ + value; \ + }) + +#endif /* _ASM_POWERPC_IMMEDIATE_H */ Index: linux-2.6-lttng/arch/powerpc/Kconfig === --- linux-2.6-lttng.orig/arch/powerpc/Kconfig 2007-11-19 12:25:21.0 -0500 +++ linux-2.6-lttng/arch/powerpc/Kconfig2007-11-19 12:26:01.0 -0500 @@ -81,6 +81,7 @@ config PPC default y select HAVE_OPROFILE select HAVE_KPROBES + select HAVE_IMMEDIATE config EARLY_PRINTK bool -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 17/24] Immediate Values - Documentation
Changelog: - Remove imv_set_early (removed from API). - Use imv_* instead of immediate_*. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Rusty Russell <[EMAIL PROTECTED]> --- Documentation/immediate.txt | 221 1 file changed, 221 insertions(+) Index: linux-2.6-lttng/Documentation/immediate.txt === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6-lttng/Documentation/immediate.txt 2007-11-03 20:28:58.0 -0400 @@ -0,0 +1,221 @@ + Using the Immediate Values + + Mathieu Desnoyers + + +This document introduces Immediate Values and their use. + + +* Purpose of immediate values + +An immediate value is used to compile into the kernel variables that sit within +the instruction stream. They are meant to be rarely updated but read often. +Using immediate values for these variables will save cache lines. + +This infrastructure is specialized in supporting dynamic patching of the values +in the instruction stream when multiple CPUs are running without disturbing the +normal system behavior. + +Compiling code meant to be rarely enabled at runtime can be done using +if (unlikely(imv_read(var))) as condition surrounding the code. The +smallest data type required for the test (an 8 bits char) is preferred, since +some architectures, such as powerpc, only allow up to 16 bits immediate values. + + +* Usage + +In order to use the "immediate" macros, you should include linux/immediate.h. + +#include + +DEFINE_IMV(char, this_immediate); +EXPORT_IMV_SYMBOL(this_immediate); + + +And use, in the body of a function: + +Use imv_set(this_immediate) to set the immediate value. + +Use imv_read(this_immediate) to read the immediate value. + +The immediate mechanism supports inserting multiple instances of the same +immediate. Immediate values can be put in inline functions, inlined static +functions, and unrolled loops. 
+ +If you have to read the immediate values from a function declared as __init or +__exit, you should explicitly use _imv_read(), which will fall back on a +global variable read. Failing to do so will leave a reference to the __init +section after it is freed (it would generate a modpost warning). + +You can choose to set an initial static value to the immediate by using, for +instance: + +DEFINE_IMV(long, myptr) = 10; + + +* Optimization for a given architecture + +One can implement optimized immediate values for a given architecture by +replacing asm-$ARCH/immediate.h. + + +* Performance improvement + + + * Memory hit for a data-based branch + +Here are the results on a 3GHz Pentium 4: + +number of tests: 100 +number of branches per test: 10 +memory hit cycles per iteration (mean): 636.611 +L1 cache hit cycles per iteration (mean): 89.6413 +instruction stream based test, cycles per iteration (mean): 85.3438 +Just getting the pointer from a modulo on a pseudo-random value, doing + nothing with it, cycles per iteration (mean): 77.5044 + +So: +Base case: 77.50 cycles +instruction stream based test: +7.8394 cycles +L1 cache hit based test:+12.1369 cycles +Memory load based test: +559.1066 cycles + +So let's say we have a ping flood coming at +(14014 packets transmitted, 14014 received, 0% packet loss, time 1826ms) +7674 packets per second. If we put 2 markers for irq entry/exit, it +brings us to 15348 markers sites executed per second. + +(15348 exec/s) * (559 cycles/exec) / (3G cycles/s) = 0.0029 +We therefore have a 0.29% slowdown just on this case. + +Compared to this, the instruction stream based test will cause a +slowdown of: + +(15348 exec/s) * (7.84 cycles/exec) / (3G cycles/s) = 0.00004 +For a 0.004% slowdown.
+ +If we plan to use this for memory allocation, spinlock, and all sorts of +very high event rate tracing, we can assume it will execute 10 to 100 +times more sites per second, which brings us to 0.4% slowdown with the +instruction stream based test compared to 29% slowdown with the memory +load based test on a system with high memory pressure. + + + + * Markers impact under heavy memory load + +Running a kernel with my LTTng instrumentation set, in a test that +generates memory pressure (from userspace) by trashing L1 and L2 caches +between calls to getppid() (note: syscall_trace is active and calls +a marker upon syscall entry and syscall exit; markers are disarmed). +This test is done in user-space, so there are some delays due to IRQs +coming and to the scheduler. (UP 2.6.22-rc6-mm1 kernel, task with -20 +nice level) + +My first set of results: Linear cache trashing, turned out not to be +very interesting, because it seems like the linearity of the memset on a +full array is somehow detected and it does not "really" trash the +caches. + +Now the most interesting result: Random walk L1 and L2 trashing +surrounding a getppid() call. + +- Markers compiled out (but
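The slowdown estimates in the documentation patch above are plain ratios: marker sites executed per second, times extra cycles per site, divided by the CPU clock rate. The short sketch below redoes that arithmetic with the figures quoted in the text (15348 sites per second, 559 and 7.84 extra cycles, a 3 GHz clock). The function name slowdown_fraction is hypothetical and is not part of the patch.

```c
/*
 * Fraction of CPU time spent on the extra cycles of instrumented sites.
 * sites_per_sec:   marker sites executed per second
 * cycles_per_site: extra cycles each site costs
 * cpu_hz:          CPU clock rate in cycles per second
 */
double slowdown_fraction(double sites_per_sec, double cycles_per_site,
			 double cpu_hz)
{
	return sites_per_sec * cycles_per_site / cpu_hz;
}
```

Calling slowdown_fraction(15348, 559, 3e9) gives roughly 0.0029, the 0.29% quoted for the memory load based test, while slowdown_fraction(15348, 7.84, 3e9) gives roughly 0.00004, the 0.004% quoted for the instruction stream based test. Scaling the site rate by 100 yields the 29% versus 0.4% comparison made for high event rate tracing.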
[patch 05/24] Text Edit Lock - Architecture Independent Code
This is an architecture independent synchronization around kernel text modifications through use of a global mutex. A mutex has been chosen so that kprobes, the main user of this, can sleep during memory allocation between the memory read of the instructions it must replace and the memory write of the breakpoint. Other users of this interface: immediate values. Paravirt and alternatives are always done when SMP is inactive, so there is no need to use locks. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Andi Kleen <[EMAIL PROTECTED]> --- include/linux/memory.h |7 +++ mm/memory.c| 34 ++ 2 files changed, 41 insertions(+) Index: linux-2.6-lttng/include/linux/memory.h === --- linux-2.6-lttng.orig/include/linux/memory.h 2007-11-07 11:11:26.0 -0500 +++ linux-2.6-lttng/include/linux/memory.h 2007-11-07 11:13:48.0 -0500 @@ -93,4 +93,11 @@ extern int memory_notify(unsigned long v #define hotplug_memory_notifier(fn, pri) do { } while (0) #endif +/* + * Take and release the kernel text modification lock, used for code patching. + * Users of this lock can sleep. + */ +extern void kernel_text_lock(void); +extern void kernel_text_unlock(void); + #endif /* _LINUX_MEMORY_H_ */ Index: linux-2.6-lttng/mm/memory.c === --- linux-2.6-lttng.orig/mm/memory.c2007-11-07 11:12:33.0 -0500 +++ linux-2.6-lttng/mm/memory.c 2007-11-07 11:14:25.0 -0500 @@ -50,6 +50,8 @@ #include #include #include +#include +#include #include #include @@ -84,6 +86,12 @@ EXPORT_SYMBOL(high_memory); int randomize_va_space __read_mostly = 1; +/* + * mutex protecting text section modification (dynamic code patching). + * some users need to sleep (allocating memory...) while they hold this lock.
+ */ +static DEFINE_MUTEX(text_mutex); + static int __init disable_randmaps(char *s) { randomize_va_space = 0; @@ -2748,3 +2756,29 @@ int access_process_vm(struct task_struct return buf - old_buf; } + +/** + * kernel_text_lock - Take the kernel text modification lock + * + * Insures mutual write exclusion of kernel and modules text live text + * modification. Should be used for code patching. + * Users of this lock can sleep. + */ +void __kprobes kernel_text_lock(void) +{ + mutex_lock(&text_mutex); +} +EXPORT_SYMBOL_GPL(kernel_text_lock); + +/** + * kernel_text_unlock - Release the kernel text modification lock + * + * Insures mutual write exclusion of kernel and modules text live text + * modification. Should be used for code patching. + * Users of this lock can sleep. + */ +void __kprobes kernel_text_unlock(void) +{ + mutex_unlock(&text_mutex); +} +EXPORT_SYMBOL_GPL(kernel_text_unlock); -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 03/24] Kprobes - declare kprobe_mutex static
Since it will not be used by other kernel objects, it makes sense to declare it static. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] --- kernel/kprobes.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6-lttng/kernel/kprobes.c === --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-08-19 09:09:15.0 -0400 +++ linux-2.6-lttng/kernel/kprobes.c2007-08-19 17:18:07.0 -0400 @@ -68,7 +68,7 @@ static struct hlist_head kretprobe_inst_ /* NOTE: change this value only with kprobe_mutex held */ static bool kprobe_enabled; -DEFINE_MUTEX(kprobe_mutex);/* Protects kprobe_table */ +static DEFINE_MUTEX(kprobe_mutex); /* Protects kprobe_table */ DEFINE_SPINLOCK(kretprobe_lock); /* Protects kretprobe_inst_table */ static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL; -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 06/24] Text Edit Lock - Alternative code for x86
Fix a memcpy that should be a text_poke (in apply_alternatives). Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA correctly and so the CPU HOTPLUG special case can be removed. Add text_poke_early, for alternatives and paravirt boot-time and module load time patching. Notes: - A macro is used instead of an inline function to deal with circular header include otherwise necessary for read_cr0 and preempt_disable/enable. Changelog: - Fix text_set and text_poke alignment check (mixed up bitwise and and or) - Remove text_set - Use the new macro INIT_ARRAY() to stop polluting the C files with ({ }) brackets (which breaks some c parsers in editors). - Export add_nops, so it can be used by others. - Remove x86 test for "wp_works_ok", it will just be ignored by the architecture if not supported. - Document text_poke_early. - Remove clflush, since it breaks some VIA architectures and is not strictly necessary. - Add kerneldoc to text_poke and text_poke_early. - Remove arg cr0 from kernel_wp_save/restore. Change the macro name for kernel_wp_disable/enable. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Andi Kleen <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] CC: Thomas Gleixner <[EMAIL PROTECTED]> CC: Ingo Molnar <[EMAIL PROTECTED]> CC: H. Peter Anvin <[EMAIL PROTECTED]> --- arch/x86/kernel/alternative.c| 56 --- include/asm-x86/alternative_32.h | 36 - include/asm-x86/alternative_64.h | 36 - 3 files changed, 116 insertions(+), 12 deletions(-) Index: linux-2.6-lttng/arch/x86/kernel/alternative.c === --- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2007-12-06 10:08:58.0 -0500 +++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2007-12-06 10:08:58.0 -0500 @@ -173,7 +173,7 @@ static const unsigned char*const * find_ #endif /* CONFIG_X86_64 */ /* Use this to add nops to a buffer, then text_poke the whole buffer. 
*/ -static void add_nops(void *insns, unsigned int len) +void add_nops(void *insns, unsigned int len) { const unsigned char *const *noptable = find_nop_table(); @@ -186,6 +186,7 @@ static void add_nops(void *insns, unsign len -= noplen; } } +EXPORT_SYMBOL_GPL(add_nops); extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; extern u8 *__smp_locks[], *__smp_locks_end[]; @@ -219,7 +220,7 @@ void apply_alternatives(struct alt_instr memcpy(insnbuf, a->replacement, a->replacementlen); add_nops(insnbuf + a->replacementlen, a->instrlen - a->replacementlen); - text_poke(instr, insnbuf, a->instrlen); + text_poke_early(instr, insnbuf, a->instrlen); } } @@ -234,7 +235,8 @@ static void alternatives_smp_lock(u8 **s continue; if (*ptr > text_end) continue; - text_poke(*ptr, ((unsigned char []){0xf0}), 1); /* add lock prefix */ + /* add lock prefix */ + text_poke(*ptr, INIT_ARRAY(unsigned char, 0xf0, 1), 1); }; } @@ -397,7 +399,7 @@ void apply_paravirt(struct paravirt_patc /* Pad the rest with nops */ add_nops(insnbuf + used, p->len - used); - text_poke(p->instr, insnbuf, p->len); + text_poke_early(p->instr, insnbuf, p->len); } } extern struct paravirt_patch_site __start_parainstructions[], @@ -457,18 +459,52 @@ void __init alternative_instructions(voi #endif } -/* - * Warning: +/** + * text_poke_early - Update instructions on a live kernel at boot time + * @addr: address to modify + * @opcode: source of the copy + * @len: length to copy + * * When you use this code to patch more than one byte of an instruction * you need to make sure that other CPUs cannot execute this code in parallel. - * Also no thread must be currently preempted in the middle of these instructions. - * And on the local CPU you need to be protected again NMI or MCE handlers - * seeing an inconsistent instruction while you patch. + * Also no thread must be currently preempted in the middle of these + * instructions. 
And on the local CPU you need to be protected again NMI or MCE + * handlers seeing an inconsistent instruction while you patch. + * Warning: read_cr0 is modified by paravirt, this is why we have _early + * versions. They are not in the __init section because they can be used at + * module load time. */ -void __kprobes text_poke(void *addr, unsigned char *opcode, int len) +void *text_poke_early(void *addr, const void *opcode, size_t len) { memcpy(addr, opcode, len); sync_core(); /* Could also do a CLFLUSH here to speed up CPU recovery; but that causes hangs on some VIA CPUs. */ + return addr; } + +/** + * text_poke -
[patch 00/24] Markers use immediate values, for 2.6.24-rc5-mm1
Hi Andrew,
Here are the patches that would be interesting to queue for 2.6.25. As you asked, the patchset applies to 2.6.24-rc5-mm1. It includes the following logical changes and applies in the following order.
Thanks,
Mathieu
#Text Edit Lock
kprobes-use-mutex-for-insn-pages.patch
kprobes-dont-use-kprobes-mutex-in-arch-code.patch
kprobes-declare-kprobes-mutex-static.patch
declare-array.patch
text-edit-lock-architecture-independent-code.patch
text-edit-lock-alternative-i386-and-x86_64.patch
text-edit-lock-kprobes-architecture-independent.patch
text-edit-lock-kprobes-i386.patch
text-edit-lock-kprobes-x86_64.patch
text-edit-lock-i386-standardize-debug-rodata.patch
text-edit-lock-x86_64-standardize-debug-rodata.patch
#
#Immediate Values
immediate-values-architecture-independent-code.patch
immediate-values-kconfig-embedded.patch
immediate-values-x86-optimization.patch
add-text-poke-to-powerpc.patch
immediate-values-powerpc-optimization.patch
immediate-values-documentation.patch
#
profiling-use-immediate-values.patch
#
#Markers use immediate values
immediate-values-move-kprobes-x86-restore-interrupt-to-kdebug-h.patch
add-discard-section-to-x86.patch
immediate-values-x86-optimization-nmi-mce-support.patch
immediate-values-powerpc-optimization-nmi-mce-support.patch
immediate-values-use-arch-nmi-mce-support.patch
linux-kernel-markers-immediate-values.patch
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 09/24] Text Edit Lock - kprobes x86_64
Make kprobes use INIT_ARRAY().

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Tested-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/kprobes_64.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/kprobes_64.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_64.c	2007-11-13 09:45:35.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_64.c	2007-11-13 09:45:46.0 -0500
@@ -215,12 +215,13 @@ static void __kprobes arch_copy_kprobe(s
 
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, BREAKPOINT_INSTRUCTION, 1),
+		  1);
 }
 
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, &p->opcode, 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, p->opcode, 1), 1);
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 21/24] Immediate Values - x86 Optimization NMI and MCE support
x86 optimization of the immediate values which uses a movl with code patching to set/unset the value used to populate the register used as variable source. It uses a breakpoint to bypass the instruction being changed, which lessens the interrupt latency of the operation and protects against NMIs and MCE. Changelog: - Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing non atomic writes to a code region only touched by us (nobody can execute it since we are protected by the imv_mutex). - Add x86_64 support, ready for i386+x86_64 -> x86 merge. - Use asm-x86/asm.h. - Change the immediate.c update code to support variable length opcodes. - Use imv_* instead of immediate_*. - Use kernel_wp_disable/enable instead of save/restore. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Andi Kleen <[EMAIL PROTECTED]> CC: "H. Peter Anvin" <[EMAIL PROTECTED]> CC: Chuck Ebbert <[EMAIL PROTECTED]> CC: Christoph Hellwig <[EMAIL PROTECTED]> CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]> CC: Thomas Gleixner <[EMAIL PROTECTED]> CC: Ingo Molnar <[EMAIL PROTECTED]> --- arch/x86/kernel/Makefile_32 |1 arch/x86/kernel/Makefile_64 |1 arch/x86/kernel/immediate.c | 277 arch/x86/kernel/traps_32.c | 10 - include/asm-x86/immediate.h | 42 +- 5 files changed, 322 insertions(+), 9 deletions(-) Index: linux-2.6-lttng/include/asm-x86/immediate.h === --- linux-2.6-lttng.orig/include/asm-x86/immediate.h2007-12-06 09:41:58.0 -0500 +++ linux-2.6-lttng/include/asm-x86/immediate.h 2007-12-06 09:42:29.0 -0500 @@ -12,6 +12,18 @@ #include +struct __imv { + unsigned long var; /* Pointer to the identifier variable of the +* immediate value +*/ + unsigned long imv; /* +* Pointer to the memory location of the +* immediate value within the instruction. +*/ + unsigned char size; /* Type size. */ + unsigned char insn_size;/* Type size. 
*/ +} __attribute__ ((packed)); + /** * imv_read - read immediate variable * @name: immediate value name @@ -26,6 +38,11 @@ * what will generate an instruction with 8 bytes immediate value (not the REX.W * prefixed one that loads a sign extended 32 bits immediate value in a r64 * register). + * + * Create the instruction in a discarded section to calculate its size. This is + * how we can align the beginning of the instruction on an address that will + * permit atomic modification of the immediate value without knowing the size of + * the opcode used by the compiler. The operand size is known in advance. */ #define imv_read(name) \ ({ \ @@ -35,8 +52,9 @@ case 1: \ asm(".section __imv,\"a\",@progbits\n\t"\ _ASM_PTR "%c1, (3f)-%c2\n\t"\ - ".byte %c2\n\t" \ + ".byte %c2, (3f-2f)\n\t"\ ".previous\n\t" \ + "2:\n\t"\ "mov $0,%0\n\t" \ "3:\n\t"\ : "=q" (value) \ @@ -45,10 +63,16 @@ break; \ case 2: \ case 4: \ - asm(".section __imv,\"a\",@progbits\n\t"\ + asm(".section __discard,\"\",@progbits\n\t" \ + "1:\n\t"\ + "mov $0,%0\n\t" \ + "2:\n\t"\ + ".previous\n\t" \ + ".section __imv,\"a\",@progbits\n\t"\ _ASM_PTR "%c1, (3f)-%c2\n\t"\ - ".byte %c2\n\t" \ + ".byte %c2, (2b-1b)\n\t"\ ".previous\n\t" \ + ".org . + ((-.-(2b-1b)) & (%c2-1)), 0x90\n\t" \ "mov $0,%0\n\t" \
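The `.org . + ((-.-(2b-1b)) & (%c2-1)), 0x90` directive in the 2- and 4-byte cases above pads with NOPs so that the end of the `mov` instruction, and therefore the immediate operand that occupies its last bytes, falls on an address aligned to the operand size. The arithmetic can be sketched in plain C as below. This is only an illustration: the function name `imv_pad` and the sample instruction lengths are assumptions, not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the assembler expression
 *   (-. - (2b-1b)) & (size - 1)
 * addr:     current location counter (".")
 * insn_len: length of the mov instruction, measured in the __discard
 *           section as (2b-1b)
 * imm_size: operand size in bytes, must be a power of two
 * Returns the number of 0x90 padding bytes to emit before the instruction.
 */
static unsigned long imv_pad(unsigned long addr, unsigned long insn_len,
			     unsigned long imm_size)
{
	assert(imm_size != 0 && (imm_size & (imm_size - 1)) == 0);
	/* Unsigned negation wraps, which is what the assembler computes. */
	return (0UL - (addr + insn_len)) & (imm_size - 1);
}

/* Address of the immediate once the padding has been applied. */
static unsigned long imv_imm_addr(unsigned long addr, unsigned long insn_len,
				  unsigned long imm_size)
{
	unsigned long start = addr + imv_pad(addr, insn_len, imm_size);

	/* The immediate is assumed to be the last imm_size bytes. */
	return start + insn_len - imm_size;
}
```

For instance, assuming a 5-byte `mov $imm32,%eax` emitted at 0x1001, two NOPs are added and the immediate lands at 0x1004, so a 4-byte store to it cannot straddle an alignment boundary.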
[patch 20/24] Add __discard section to x86
Add a __discard section to the linker script. Code produced in this section will not be put in the vmlinux file. This is useful when we have to calculate the size of an instruction before actually declaring it (for alignment purposes for instance). This is used by the immediate values.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: H. Peter Anvin <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/x86/kernel/vmlinux_32.lds.S |    1 +
 arch/x86/kernel/vmlinux_64.lds.S |    1 +
 2 files changed, 2 insertions(+)

Index: linux-2.6-lttng/arch/x86/kernel/vmlinux_32.lds.S
===
--- linux-2.6-lttng.orig/arch/x86/kernel/vmlinux_32.lds.S	2007-11-14 14:10:43.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/vmlinux_32.lds.S	2007-11-14 14:11:32.0 -0500
@@ -205,6 +205,7 @@ SECTIONS
   /* Sections to be discarded */
   /DISCARD/ : {
 	*(.exitcall.exit)
+	*(__discard)
 	}
 
   STABS_DEBUG
Index: linux-2.6-lttng/arch/x86/kernel/vmlinux_64.lds.S
===
--- linux-2.6-lttng.orig/arch/x86/kernel/vmlinux_64.lds.S	2007-11-14 14:10:46.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/vmlinux_64.lds.S	2007-11-14 14:11:48.0 -0500
@@ -227,6 +227,7 @@ SECTIONS
   /DISCARD/ : {
 	*(.exitcall.exit)
 	*(.eh_frame)
+	*(__discard)
 	}
 
   STABS_DEBUG
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 13/24] Immediate Values - Kconfig menu in EMBEDDED
Immediate values provide a way to use dynamic code patching to update variables sitting within the instruction stream. It saves cache lines normally used by static read-mostly variables. Enable it by default, but let users disable it through the EMBEDDED menu with the "Disable immediate values" submenu entry.

Note: Since I think that I really should give embedded systems developers using RO memory the option to disable the immediate values, I choose to leave this menu option there, in the EMBEDDED menu. Also, the "CONFIG_IMMEDIATE" makes sense because we want to compile out all the immediate code when we decide not to use optimized immediate values at all (it removes otherwise unused code).

Changelog:
- Change ARCH_SUPPORTS_IMMEDIATE for ARCH_HAS_IMMEDIATE

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
CC: Adrian Bunk <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: Alexey Dobriyan <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
---
 init/Kconfig |   24
 1 file changed, 24 insertions(+)

Index: linux-2.6-lttng/init/Kconfig
===
--- linux-2.6-lttng.orig/init/Kconfig	2007-12-05 20:53:19.0 -0500
+++ linux-2.6-lttng/init/Kconfig	2007-12-05 20:53:35.0 -0500
@@ -435,6 +435,20 @@ config CC_OPTIMIZE_FOR_SIZE
 config SYSCTL
 	bool
 
+config IMMEDIATE
+	default y if !DISABLE_IMMEDIATE
+	depends on HAVE_IMMEDIATE
+	bool
+	help
+	  Immediate values are used as read-mostly variables that are rarely
+	  updated. They use code patching to modify the values inscribed in the
+	  instruction stream. It provides a way to save precious cache lines
+	  that would otherwise have to be used by these variables. They can be
+	  disabled through the EMBEDDED menu.
+
+config HAVE_IMMEDIATE
+	def_bool n
+
 menuconfig EMBEDDED
 	bool "Configure standard kernel features (for small systems)"
 	help
@@ -670,6 +684,16 @@ config MARKERS
 
 source "arch/Kconfig"
 
+config DISABLE_IMMEDIATE
+	default y if EMBEDDED
+	bool "Disable immediate values" if EMBEDDED
+	depends on HAVE_IMMEDIATE
+	help
+	  Disable code patching based immediate values for embedded systems. It
+	  consumes slightly more memory and requires to modify the instruction
+	  stream each time a variable is updated. Should really be disabled for
+	  embedded systems with read-only text.
+
 endmenu		# General setup
 
 config RT_MUTEXES
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 11/24] Text Edit Lock - x86_64 standardize debug rodata
Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86_64/mm/init.c |   23 +--
 1 file changed, 5 insertions(+), 18 deletions(-)

Index: linux-2.6-lttng/arch/x86/mm/init_64.c
===
--- linux-2.6-lttng.orig/arch/x86/mm/init_64.c	2007-09-24 11:00:01.0 -0400
+++ linux-2.6-lttng/arch/x86/mm/init_64.c	2007-09-24 11:00:02.0 -0400
@@ -592,25 +592,11 @@ void free_initmem(void)
 
 void mark_rodata_ro(void)
 {
-	unsigned long start = (unsigned long)_stext, end;
+	unsigned long start = PFN_ALIGN(_stext);
+	unsigned long end = PFN_ALIGN(__end_rodata);
 
-#ifdef CONFIG_HOTPLUG_CPU
-	/* It must still be possible to apply SMP alternatives. */
-	if (num_possible_cpus() > 1)
-		start = (unsigned long)_etext;
-#endif
-
-#ifdef CONFIG_KPROBES
-	start = (unsigned long)__start_rodata;
-#endif
-
-	end = (unsigned long)__end_rodata;
-	start = (start + PAGE_SIZE - 1) & PAGE_MASK;
-	end &= PAGE_MASK;
-	if (end <= start)
-		return;
-
-	change_page_attr_addr(start, (end - start) >> PAGE_SHIFT, PAGE_KERNEL_RO);
+	change_page_attr_addr(start, (end - start) >> PAGE_SHIFT,
+			      PAGE_KERNEL_RO);
 
 	printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
 	       (end - start) >> 10);
@@ -623,6 +609,7 @@ void mark_rodata_ro(void)
 	 */
 	global_flush_tlb();
 }
+
 #endif
 
 #ifdef CONFIG_BLK_DEV_INITRD
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
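As a rough illustration of the page arithmetic in this patch, the sketch below reproduces the rounding in userspace. It assumes 4 KiB pages and defines `PFN_ALIGN` the way include/linux/pfn.h does (round up to the next page boundary); `ro_pages` is a hypothetical helper, not a kernel function. Note that the removed code rounded `end` down with `end &= PAGE_MASK`, whereas `PFN_ALIGN` rounds it up.

```c
#include <assert.h>

/* Assumed values for illustration: 4 KiB pages. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

/* Round an address up to the next page boundary. */
#define PFN_ALIGN(x)	(((unsigned long)(x) + (PAGE_SIZE - 1)) & PAGE_MASK)

/*
 * Hypothetical helper: number of pages that would be handed to
 * change_page_attr_addr() for a region [stext, end_rodata).
 */
static unsigned long ro_pages(unsigned long stext, unsigned long end_rodata)
{
	unsigned long start = PFN_ALIGN(stext);
	unsigned long end = PFN_ALIGN(end_rodata);

	assert(end >= start);
	return (end - start) >> PAGE_SHIFT;
}
```

With `stext` at 0x1000 and `end_rodata` at 0x3001, the range covers three pages, because the partially used last page is included after rounding up.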
[patch 01/24] Kprobes - use a mutex to protect the instruction pages list.
Protect the instruction pages list by a specific insn pages mutex, called in get_insn_slot() and free_insn_slot(). It makes sure that architectures that do not need to call arch_remove_kprobe() do not take an unneeded kprobes mutex.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
---
 kernel/kprobes.c |   27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/kernel/kprobes.c	2007-08-27 11:48:56.0 -0400
+++ linux-2.6-lttng/kernel/kprobes.c	2007-08-27 11:48:58.0 -0400
@@ -95,6 +95,10 @@ enum kprobe_slot_state {
 	SLOT_USED = 2,
 };
 
+/*
+ * Protects the kprobe_insn_pages list. Can nest into kprobe_mutex.
+ */
+static DEFINE_MUTEX(kprobe_insn_mutex);
 static struct hlist_head kprobe_insn_pages;
 static int kprobe_garbage_slots;
 static int collect_garbage_slots(void);
@@ -131,7 +135,9 @@ kprobe_opcode_t __kprobes *get_insn_slot
 {
 	struct kprobe_insn_page *kip;
 	struct hlist_node *pos;
+	kprobe_opcode_t *ret;
 
+	mutex_lock(&kprobe_insn_mutex);
  retry:
 	hlist_for_each_entry(kip, pos, &kprobe_insn_pages, hlist) {
 		if (kip->nused < INSNS_PER_PAGE) {
@@ -140,7 +146,8 @@ kprobe_opcode_t __kprobes *get_insn_slot
 				if (kip->slot_used[i] == SLOT_CLEAN) {
 					kip->slot_used[i] = SLOT_USED;
 					kip->nused++;
-					return kip->insns + (i * MAX_INSN_SIZE);
+					ret = kip->insns + (i * MAX_INSN_SIZE);
+					goto end;
 				}
 			}
 			/* Surprise! No unused slots. Fix kip->nused. */
@@ -154,8 +161,10 @@ kprobe_opcode_t __kprobes *get_insn_slot
 	}
 	/* All out of space. Need to allocate a new page. Use slot 0. */
 	kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL);
-	if (!kip)
-		return NULL;
+	if (!kip) {
+		ret = NULL;
+		goto end;
+	}
 
 	/*
 	 * Use module_alloc so this page is within +/- 2GB of where the
@@ -165,7 +174,8 @@ kprobe_opcode_t __kprobes *get_insn_slot
 	kip->insns = module_alloc(PAGE_SIZE);
 	if (!kip->insns) {
 		kfree(kip);
-		return NULL;
+		ret = NULL;
+		goto end;
 	}
 	INIT_HLIST_NODE(&kip->hlist);
 	hlist_add_head(&kip->hlist, &kprobe_insn_pages);
@@ -173,7 +183,10 @@ kprobe_opcode_t __kprobes *get_insn_slot
 	kip->slot_used[0] = SLOT_USED;
 	kip->nused = 1;
 	kip->ngarbage = 0;
-	return kip->insns;
+	ret = kip->insns;
+end:
+	mutex_unlock(&kprobe_insn_mutex);
+	return ret;
 }
 
 /* Return 1 if all garbages are collected, otherwise 0. */
@@ -207,7 +220,7 @@ static int __kprobes collect_garbage_slo
 	struct kprobe_insn_page *kip;
 	struct hlist_node *pos, *next;
 
-	/* Ensure no-one is preepmted on the garbages */
+	/* Ensure no-one is preempted on the garbages */
 	if (check_safety() != 0)
 		return -EAGAIN;
 
@@ -231,6 +244,7 @@ void __kprobes free_insn_slot(kprobe_opc
 	struct kprobe_insn_page *kip;
 	struct hlist_node *pos;
 
+	mutex_lock(&kprobe_insn_mutex);
 	hlist_for_each_entry(kip, pos, &kprobe_insn_pages, hlist) {
 		if (kip->insns <= slot &&
 		    slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) {
@@ -247,6 +261,7 @@ void __kprobes free_insn_slot(kprobe_opc
 
 	if (dirty && ++kprobe_garbage_slots > INSNS_PER_PAGE)
 		collect_garbage_slots();
+	mutex_unlock(&kprobe_insn_mutex);
 }
 #endif
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 19/24] Immediate Values - Move Kprobes x86 restore_interrupt to kdebug.h
Since the breakpoint handler is useful both to kprobes and immediate values, it makes sense to make the required restore_interrupt() available through asm-i386/kdebug.h. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> CC: Christoph Hellwig <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: Thomas Gleixner <[EMAIL PROTECTED]> CC: Ingo Molnar <[EMAIL PROTECTED]> CC: H. Peter Anvin <[EMAIL PROTECTED]> --- include/asm-x86/kdebug.h | 12 include/asm-x86/kprobes_32.h |9 - include/asm-x86/kprobes_64.h |9 - 3 files changed, 12 insertions(+), 18 deletions(-) Index: linux-2.6-lttng/include/asm-x86/kdebug.h === --- linux-2.6-lttng.orig/include/asm-x86/kdebug.h 2007-11-02 15:01:53.0 -0400 +++ linux-2.6-lttng/include/asm-x86/kdebug.h2007-11-02 15:02:00.0 -0400 @@ -3,6 +3,9 @@ #include +#include +#include + struct pt_regs; /* Grossly misnamed. */ @@ -30,4 +33,13 @@ extern void dump_pagetable(unsigned long extern unsigned long oops_begin(void); extern void oops_end(unsigned long); +/* trap3/1 are intr gates for kprobes. So, restore the status of IF, + * if necessary, before executing the original int3/1 (trap) handler. + */ +static inline void restore_interrupts(struct pt_regs *regs) +{ + if (regs->eflags & IF_MASK) + local_irq_enable(); +} + #endif Index: linux-2.6-lttng/include/asm-x86/kprobes_32.h === --- linux-2.6-lttng.orig/include/asm-x86/kprobes_32.h 2007-11-02 15:01:53.0 -0400 +++ linux-2.6-lttng/include/asm-x86/kprobes_32.h2007-11-02 15:02:00.0 -0400 @@ -79,15 +79,6 @@ struct kprobe_ctlblk { struct prev_kprobe prev_kprobe; }; -/* trap3/1 are intr gates for kprobes. So, restore the status of IF, - * if necessary, before executing the original int3/1 (trap) handler. 
- */ -static inline void restore_interrupts(struct pt_regs *regs) -{ - if (regs->eflags & IF_MASK) - local_irq_enable(); -} - extern int kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data); extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr); Index: linux-2.6-lttng/include/asm-x86/kprobes_64.h === --- linux-2.6-lttng.orig/include/asm-x86/kprobes_64.h 2007-11-02 15:02:10.0 -0400 +++ linux-2.6-lttng/include/asm-x86/kprobes_64.h2007-11-02 15:02:22.0 -0400 @@ -72,15 +72,6 @@ struct kprobe_ctlblk { struct prev_kprobe prev_kprobe; }; -/* trap3/1 are intr gates for kprobes. So, restore the status of IF, - * if necessary, before executing the original int3/1 (trap) handler. - */ -static inline void restore_interrupts(struct pt_regs *regs) -{ - if (regs->eflags & IF_MASK) - local_irq_enable(); -} - extern int post_kprobe_handler(struct pt_regs *regs); extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr); extern int kprobe_handler(struct pt_regs *regs); -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 22/24] Immediate Values - Powerpc Optimization NMI MCE support
Use an atomic update for immediate values. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> CC: Rusty Russell <[EMAIL PROTECTED]> CC: Christoph Hellwig <[EMAIL PROTECTED]> CC: Paul Mackerras <[EMAIL PROTECTED]> --- arch/powerpc/kernel/Makefile|1 arch/powerpc/kernel/immediate.c | 73 include/asm-powerpc/immediate.h | 18 + 3 files changed, 92 insertions(+) Index: linux-2.6-lttng/arch/powerpc/kernel/immediate.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6-lttng/arch/powerpc/kernel/immediate.c 2007-12-20 20:52:27.0 -0500 @@ -0,0 +1,73 @@ +/* + * Powerpc optimized immediate values enabling/disabling. + * + * Mathieu Desnoyers <[EMAIL PROTECTED]> + */ + +#include +#include +#include +#include +#include +#include + +#define LI_OPCODE_LEN 2 + +/** + * arch_imv_update - update one immediate value + * @imv: pointer of type const struct __imv to update + * @early: early boot (1), normal (0) + * + * Update one immediate value. Must be called with imv_mutex held. + */ +int arch_imv_update(const struct __imv *imv, int early) +{ +#ifdef CONFIG_KPROBES + kprobe_opcode_t *insn; + /* +* Fail if a kprobe has been set on this instruction. +* (TODO: we could eventually do better and modify all the (possibly +* nested) kprobes for this site if kprobes had an API for this. +*/ + switch (imv->size) { + case 1: /* The uint8_t points to the 3rd byte of the +* instruction */ + insn = (void *)(imv->imv - 1 - LI_OPCODE_LEN); + break; + case 2: insn = (void *)(imv->imv - LI_OPCODE_LEN); + break; + default: + return -EINVAL; + } + + if (unlikely(!early && *insn == BREAKPOINT_INSTRUCTION)) { + printk(KERN_WARNING "Immediate value in conflict with kprobe. " + "Variable at %p, " + "instruction at %p, size %lu\n", + (void *)imv->imv, + (void *)imv->var, imv->size); + return -EBUSY; + } +#endif + + /* +* If the variable and the instruction have the same value, there is +* nothing to do. 
+*/ + switch (imv->size) { + case 1: if (*(uint8_t *)imv->imv + == *(uint8_t *)imv->var) + return 0; + break; + case 2: if (*(uint16_t *)imv->imv + == *(uint16_t *)imv->var) + return 0; + break; + default:return -EINVAL; + } + memcpy((void *)imv->imv, (void *)imv->var, + imv->size); + flush_icache_range(imv->imv, + imv->imv + imv->size); + return 0; +} Index: linux-2.6-lttng/include/asm-powerpc/immediate.h === --- linux-2.6-lttng.orig/include/asm-powerpc/immediate.h2007-12-20 20:52:20.0 -0500 +++ linux-2.6-lttng/include/asm-powerpc/immediate.h 2007-12-20 20:52:27.0 -0500 @@ -12,6 +12,16 @@ #include +struct __imv { + unsigned long var; /* Identifier variable of the immediate value */ + unsigned long imv; /* +* Pointer to the memory location that holds +* the immediate value within the load immediate +* instruction. +*/ + unsigned char size; /* Type size. */ +} __attribute__ ((packed)); + /** * imv_read - read immediate variable * @name: immediate value name @@ -19,6 +29,11 @@ * Reads the value of @name. * Optimized version of the immediate. * Do not use in __init and __exit functions. Use _imv_read() instead. + * Makes sure the 2 bytes update will be atomic by aligning the immediate + * value. Use a normal memory read for the 4 bytes immediate because there is no + * way to atomically update it without using a seqlock read side, which would + * cost more in term of total i-cache and d-cache space than a simple memory + * read. */ #define imv_read(name) \ ({ \ @@ -40,6 +55,7 @@ PPC_LONG "%c1, ((1f)-2)\n\t"\ ".byte 2\n\t" \ ".previous\n\t" \ + ".align 2\n\t" \ "li %0,0\n\t" \ "1:\n\t"\ :
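The core of arch_imv_update() above is "compare, and only write if the value changed". A minimal userspace sketch of that idea follows, under stated assumptions: `struct imv_sketch` and `imv_update_sketch` are hypothetical names, ordinary data buffers stand in for the instruction stream, and there is no kprobe check or icache flush. Unlike the patch, which returns 0 in both cases, the sketch returns 1 when it actually wrote, purely so the two paths can be told apart.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct __imv, using real pointers. */
struct imv_sketch {
	void *var;		/* source variable holding the new value */
	void *imv;		/* location of the immediate to patch */
	unsigned char size;	/* operand size in bytes */
};

/*
 * Returns 0 if the immediate already matches the variable, 1 if it was
 * rewritten, and -EINVAL for sizes this sketch does not handle (only 1
 * and 2 bytes, as in the powerpc patch).
 */
static int imv_update_sketch(const struct imv_sketch *e)
{
	assert(e && e->var && e->imv);

	switch (e->size) {
	case 1:
	case 2:
		break;
	default:
		return -EINVAL;
	}

	/* Nothing to do when variable and immediate already agree. */
	if (memcmp(e->imv, e->var, e->size) == 0)
		return 0;

	/* In the kernel this store must be atomic; here it is a plain copy. */
	memcpy(e->imv, e->var, e->size);
	return 1;
}
```

In the real code the 2-byte store is made safe by aligning the immediate, and flush_icache_range() follows the copy; neither is modelled here.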
[patch 15/24] Add text_poke and sync_core to powerpc
- Needed on architectures where we must surround live instruction modification with "WP flag disable".
- Turns into a memcpy on powerpc since there is no WP flag activated for instruction pages (yet..).
- Add empty sync_core to powerpc so it can be used in architecture independent code.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Paul Mackerras <[EMAIL PROTECTED]>
---
 include/asm-powerpc/cacheflush.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-powerpc/cacheflush.h
===
--- linux-2.6-lttng.orig/include/asm-powerpc/cacheflush.h	2007-11-19 12:05:50.0 -0500
+++ linux-2.6-lttng/include/asm-powerpc/cacheflush.h	2007-11-19 13:27:36.0 -0500
@@ -63,7 +63,9 @@ extern void flush_dcache_phys_range(unsi
 #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
 	memcpy(dst, src, len)
 
-
+#define text_poke	memcpy
+#define text_poke_early	text_poke
+#define sync_core()
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
 /* internal debugging function */
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
[patch 18/24] Scheduler Profiling - Use Immediate Values
Use immediate values with lower d-cache hit in optimized version as a condition for scheduler profiling call. Changelog : - Use imv_* instead of immediate_*. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> --- drivers/kvm/kvm_main.c |3 ++- include/linux/profile.h |5 +++-- kernel/profile.c| 22 +++--- kernel/sched_fair.c |6 +- 4 files changed, 17 insertions(+), 19 deletions(-) Index: linux-2.6-lttng/kernel/profile.c === --- linux-2.6-lttng.orig/kernel/profile.c 2007-12-05 20:50:34.0 -0500 +++ linux-2.6-lttng/kernel/profile.c2007-12-05 20:53:43.0 -0500 @@ -42,8 +42,8 @@ static int (*timer_hook)(struct pt_regs static atomic_t *prof_buffer; static unsigned long prof_len, prof_shift; -int prof_on __read_mostly; -EXPORT_SYMBOL_GPL(prof_on); +DEFINE_IMV(char, prof_on) __read_mostly; +EXPORT_IMV_SYMBOL_GPL(prof_on); static cpumask_t prof_cpu_mask = CPU_MASK_ALL; #ifdef CONFIG_SMP @@ -61,7 +61,7 @@ static int __init profile_setup(char * s if (!strncmp(str, sleepstr, strlen(sleepstr))) { #ifdef CONFIG_SCHEDSTATS - prof_on = SLEEP_PROFILING; + imv_set(prof_on, SLEEP_PROFILING); if (str[strlen(sleepstr)] == ',') str += strlen(sleepstr) + 1; if (get_option(, )) @@ -74,7 +74,7 @@ static int __init profile_setup(char * s "kernel sleep profiling requires CONFIG_SCHEDSTATS\n"); #endif /* CONFIG_SCHEDSTATS */ } else if (!strncmp(str, schedstr, strlen(schedstr))) { - prof_on = SCHED_PROFILING; + imv_set(prof_on, SCHED_PROFILING); if (str[strlen(schedstr)] == ',') str += strlen(schedstr) + 1; if (get_option(, )) @@ -83,7 +83,7 @@ static int __init profile_setup(char * s "kernel schedule profiling enabled (shift: %ld)\n", prof_shift); } else if (!strncmp(str, kvmstr, strlen(kvmstr))) { - prof_on = KVM_PROFILING; + imv_set(prof_on, KVM_PROFILING); if (str[strlen(kvmstr)] == ',') str += strlen(kvmstr) + 1; if (get_option(, )) @@ -93,7 +93,7 @@ static int __init profile_setup(char * s prof_shift); } else if (get_option(, )) { prof_shift = par; - prof_on = CPU_PROFILING; + 
imv_set(prof_on, CPU_PROFILING); printk(KERN_INFO "kernel profiling enabled (shift: %ld)\n", prof_shift); } @@ -104,7 +104,7 @@ __setup("profile=", profile_setup); void __init profile_init(void) { - if (!prof_on) + if (!_imv_read(prof_on)) return; /* only text is profiled */ @@ -293,7 +293,7 @@ void profile_hits(int type, void *__pc, int i, j, cpu; struct profile_hit *hits; - if (prof_on != type || !prof_buffer) + if (!prof_buffer) return; pc = min((pc - (unsigned long)_stext) >> prof_shift, prof_len - 1); i = primary = (pc & (NR_PROFILE_GRP - 1)) << PROFILE_GRPSHIFT; @@ -403,7 +403,7 @@ void profile_hits(int type, void *__pc, { unsigned long pc; - if (prof_on != type || !prof_buffer) + if (!prof_buffer) return; pc = ((unsigned long)__pc - (unsigned long)_stext) >> prof_shift; atomic_add(nr_hits, _buffer[min(pc, prof_len - 1)]); @@ -560,7 +560,7 @@ static int __init create_hash_tables(voi } return 0; out_cleanup: - prof_on = 0; + imv_set(prof_on, 0); smp_mb(); on_each_cpu(profile_nop, NULL, 0, 1); for_each_online_cpu(cpu) { @@ -587,7 +587,7 @@ static int __init create_proc_profile(vo { struct proc_dir_entry *entry; - if (!prof_on) + if (!_imv_read(prof_on)) return 0; if (create_hash_tables()) return -1; Index: linux-2.6-lttng/include/linux/profile.h === --- linux-2.6-lttng.orig/include/linux/profile.h2007-12-05 20:50:34.0 -0500 +++ linux-2.6-lttng/include/linux/profile.h 2007-12-05 20:53:43.0 -0500 @@ -7,10 +7,11 @@ #include #include #include +#include #include -extern int prof_on __read_mostly; +DECLARE_IMV(char, prof_on) __read_mostly; #define CPU_PROFILING 1 #define SCHED_PROFILING2 @@ -38,7 +39,7 @@ static inline void profile_hit(int type, /* * Speedup for the common (no profiling enabled) case: */ - if (unlikely(prof_on == type)) + if (unlikely(imv_read(prof_on) == type)) profile_hits(type, ip, 1); } Index: linux-2.6-lttng/drivers/kvm/kvm_main.c
[patch 24/24] Linux Kernel Markers - Use Immediate Values
Make markers use immediate values. Changelog : - Use imv_* instead of immediate_*. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> --- Documentation/markers.txt | 17 + include/linux/marker.h| 42 -- kernel/marker.c |8 ++-- kernel/module.c |1 + 4 files changed, 52 insertions(+), 16 deletions(-) Index: linux-2.6-lttng/include/linux/marker.h === --- linux-2.6-lttng.orig/include/linux/marker.h 2007-12-05 20:53:25.0 -0500 +++ linux-2.6-lttng/include/linux/marker.h 2007-12-05 20:53:54.0 -0500 @@ -12,6 +12,7 @@ * See the file COPYING for more details. */ +#include #include struct module; @@ -42,7 +43,7 @@ struct marker { const char *format; /* Marker format string, describing the * variable argument list. */ - char state; /* Marker state. */ + DEFINE_IMV(char, state);/* Immediate value state. */ char ptype; /* probe type : 0 : single, 1 : multi */ void (*call)(const struct marker *mdata,/* Probe wrapper */ void *call_private, const char *fmt, ...); @@ -53,13 +54,14 @@ struct marker { #ifdef CONFIG_MARKERS /* + * Generic marker flavor always available. * Note : the empty asm volatile with read constraint is used here instead of a * "used" attribute to fix a gcc 4.1.x bug. * Make sure the alignment of the structure in the __markers section will * not add unwanted padding between the beginning of the section and the * structure. Force alignment to the same alignment as the section start. */ -#define __trace_mark(name, call_private, format, args...) \ +#define __trace_mark(generic, name, call_private, format, args...) 
\ do {\ static const char __mstrtab_##name[]\ __attribute__((section("__markers_strings"))) \ @@ -70,17 +72,23 @@ struct marker { 0, 0, marker_probe_cb, \ { __mark_empty_function, NULL}, NULL }; \ __mark_check_format(format, ## args); \ - if (unlikely(__mark_##name.state)) {\ - (*__mark_##name.call) \ - (&__mark_##name, call_private, \ - format, ## args); \ + if (!generic) { \ + if (unlikely(imv_read(__mark_##name.state)))\ + (*__mark_##name.call) \ + (&__mark_##name, call_private, \ + format, ## args); \ + } else {\ + if (unlikely(_imv_read(__mark_##name.state))) \ + (*__mark_##name.call) \ + (&__mark_##name, call_private, \ + format, ## args); \ } \ } while (0) extern void marker_update_probe_range(struct marker *begin, struct marker *end); #else /* !CONFIG_MARKERS */ -#define __trace_mark(name, call_private, format, args...) \ +#define __trace_mark(generic, name, call_private, format, args...) \ __mark_check_format(format, ## args) static inline void marker_update_probe_range(struct marker *begin, struct marker *end) @@ -88,15 +96,29 @@ static inline void marker_update_probe_r #endif /* CONFIG_MARKERS */ /** - * trace_mark - Marker + * trace_mark - Marker using code patching * @name: marker name, not quoted. * @format: format string * @args...: variable argument list * - * Places a marker. + * Places a marker using optimized code patching technique (imv_read()) + * to be enabled. */ #define trace_mark(name, format, args...) \ - __trace_mark(name, NULL, format, ## args) + __trace_mark(0, name, NULL, format, ## args) + +/** + * _trace_mark - Marker using variable read + * @name: marker name, not quoted. + * @format: format string + * @args...: variable argument list + * + * Places a marker using a standard memory read (_imv_read()) to be + * enabled. Should be used for markers in __init and __exit functions and in + * lockdep code. + */ +#define _trace_mark(name, format, args...) 
\ + __trace_mark(1, name, NULL, format, ## args) /** * MARK_NOARGS - Format string for a marker with no argument. Index:
[patch 04/24] Add INIT_ARRAY() to kernel.h
Add initialization of an array, which needs brackets that would pollute kernel code, to kernel.h. It is used to declare arguments passed as function parameters such as:

text_poke(addr, INIT_ARRAY(unsigned char, 0xf0, len), len);

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/linux/kernel.h |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/linux/kernel.h
===
--- linux-2.6-lttng.orig/include/linux/kernel.h	2007-11-13 09:25:29.0 -0500
+++ linux-2.6-lttng/include/linux/kernel.h	2007-11-13 09:45:38.0 -0500
@@ -421,4 +421,6 @@ struct sysinfo {
 #define NUMA_BUILD 0
 #endif
 
+#define INIT_ARRAY(type, val, len) ((type [len]) { [0 ... (len)-1] = (val) })
+
 #endif
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
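To see what INIT_ARRAY() expands to, here is a small userspace sketch. The macro is copied from the patch; it combines a C99 compound literal with the GNU C range designator `[0 ... n]`, so it is expected to build with gcc or clang but is not strictly ISO C. The `poke` and `fill_nops` functions are hypothetical stand-ins (a plain byte copy in place of text_poke()), introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same definition as in the patch (GNU C range designator). */
#define INIT_ARRAY(type, val, len) ((type [len]) { [0 ... (len)-1] = (val) })

/* Hypothetical stand-in for text_poke(): a plain byte copy. */
static void *poke(void *addr, const void *opcode, size_t len)
{
	unsigned char *dst = addr;
	const unsigned char *src = opcode;
	size_t i;

	assert(addr && opcode);
	for (i = 0; i < len; i++)
		dst[i] = src[i];
	return addr;
}

/*
 * Fill the first 4 bytes of buf with 0x90. The compound literal is an
 * anonymous unsigned char[4] that lives until the enclosing block ends,
 * so passing its address to poke() is safe.
 */
static void fill_nops(unsigned char *buf)
{
	poke(buf, INIT_ARRAY(unsigned char, 0x90, 4), 4);
}
```

The point of the macro is that the caller does not have to declare a named temporary array just to pass a repeated byte value by pointer.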
[patch 23/24] Immediate Values Use Arch NMI and MCE Support
Remove the architecture agnostic code now replaced by architecture specific,
atomic instruction updates.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/linux/immediate.h |   11
 kernel/immediate.c        |  113 +-
 2 files changed, 4 insertions(+), 120 deletions(-)

Index: linux-2.6-lttng/kernel/immediate.c
===
--- linux-2.6-lttng.orig/kernel/immediate.c 2007-11-26 12:48:48.0 -0500
+++ linux-2.6-lttng/kernel/immediate.c 2007-11-26 13:01:15.0 -0500
@@ -19,9 +19,6 @@
 #include
 #include
 #include
-#include
-
-#include
 
 /*
  * Kernel ready to execute the SMP update that may depend on trap and ipi.
@@ -37,111 +34,6 @@ extern const struct __imv __stop___imv[]
  */
 static DEFINE_MUTEX(imv_mutex);
 
-static atomic_t wait_sync;
-
-struct ipi_loop_data {
-	long value;
-	const struct __imv *imv;
-} loop_data;
-
-static void ipi_busy_loop(void *arg)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	atomic_dec(&wait_sync);
-	do {
-		/* Make sure the wait_sync gets re-read */
-		smp_mb();
-	} while (atomic_read(&wait_sync) > loop_data.value);
-	atomic_dec(&wait_sync);
-	do {
-		/* Make sure the wait_sync gets re-read */
-		smp_mb();
-	} while (atomic_read(&wait_sync) > 0);
-	/*
-	 * Issuing a synchronizing instruction must be done on each CPU before
-	 * reenabling interrupts after modifying an instruction. Required by
-	 * Intel's errata.
-	 */
-	sync_core();
-	flush_icache_range(loop_data.imv->imv,
-		loop_data.imv->imv + loop_data.imv->size);
-	local_irq_restore(flags);
-}
-
-/**
- * apply_imv_update - update one immediate value
- * @imv: pointer of type const struct __imv to update
- *
- * Update one immediate value. Must be called with imv_mutex held.
- * It makes sure all CPUs are not executing the modified code by having them
- * busy looping with interrupts disabled.
- * It does _not_ protect against NMI and MCE (could be a problem with Intel's
- * errata if we use immediate values in their code path).
- */
-static int apply_imv_update(const struct __imv *imv)
-{
-	unsigned long flags;
-	long online_cpus;
-
-	/*
-	 * If the variable and the instruction have the same value, there is
-	 * nothing to do.
-	 */
-	switch (imv->size) {
-	case 1:	if (*(uint8_t *)imv->imv
-				== *(uint8_t *)imv->var)
-			return 0;
-		break;
-	case 2:	if (*(uint16_t *)imv->imv
-				== *(uint16_t *)imv->var)
-			return 0;
-		break;
-	case 4:	if (*(uint32_t *)imv->imv
-				== *(uint32_t *)imv->var)
-			return 0;
-		break;
-	case 8:	if (*(uint64_t *)imv->imv
-				== *(uint64_t *)imv->var)
-			return 0;
-		break;
-	default:return -EINVAL;
-	}
-
-	if (imv_early_boot_complete) {
-		kernel_text_lock();
-		lock_cpu_hotplug();
-		online_cpus = num_online_cpus();
-		atomic_set(&wait_sync, 2 * online_cpus);
-		loop_data.value = online_cpus;
-		loop_data.imv = imv;
-		smp_call_function(ipi_busy_loop, NULL, 1, 0);
-		local_irq_save(flags);
-		atomic_dec(&wait_sync);
-		do {
-			/* Make sure the wait_sync gets re-read */
-			smp_mb();
-		} while (atomic_read(&wait_sync) > online_cpus);
-		text_poke((void *)imv->imv, (void *)imv->var,
-				imv->size);
-		/*
-		 * Make sure the modified instruction is seen by all CPUs before
-		 * we continue (visible to other CPUs and local interrupts).
-		 */
-		wmb();
-		atomic_dec(&wait_sync);
-		flush_icache_range(imv->imv,
-				imv->imv + imv->size);
-		local_irq_restore(flags);
-		unlock_cpu_hotplug();
-		kernel_text_unlock();
-	} else
-		text_poke_early((void *)imv->imv, (void *)imv->var,
-				imv->size);
-	return 0;
-}
-
 /**
  * imv_update_range - Update immediate values in a range
  * @begin: pointer to the beginning of the range
@@ -154,9 +46,12 @@ void imv_update_range(const struct __imv
 {
 	const struct __imv *iter;
 	int ret;
+
 	for (iter = begin; iter < end; iter++) {
 		mutex_lock(&imv_mutex);
-		ret = apply_imv_update(iter);
+		kernel_text_lock();
+		ret = arch_imv_update(iter,
Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.
Pavel Emelyanov wrote:
> Oren Laadan wrote:
>> Serge E. Hallyn wrote:
>>> Quoting Pavel Emelyanov ([EMAIL PROTECTED]):
>>>> Oren Laadan wrote:
>>>>> Serge E. Hallyn wrote:
>>>>>> Quoting Oren Laadan ([EMAIL PROTECTED]):
>>>>>>> I hate to bring this again, but what if the admin in the container
>>>>>>> mounts an external file system (eg. nfs, usb, loop mount from a file,
>>>>>>> or via fuse), and that file system already has a device that we would
>>>>>>> like to ban inside that container ?
>>>>>> Miklos' user mount patches enforced that if !capable(CAP_MKNOD),
>>>>>> then mnt->mnt_flags |= MNT_NODEV. So that's no problem.
>>>>> Yes, that works to disallow all device files from a mounted file system.
>>>>>
>>>>> But it's a black and white thing: either they are all banned or allowed;
>>>>> you can't have some devices allowed and others not, depending on type
>>>>> A scenario where this may be useful is, for instance, if we want some
>>>>> apps in the container to execute within a pre-made chroot (sub)tree
>>>>> within that container.
>>>>>
>>>>>> But that's been pulled out of -mm! ? Crap.
>>>>>>
>>>>>>> Since anyway we will have to keep a white- (or black-) list of devices
>>>>>>> that are permitted in a container, and that list may even change
>>>>>>> per container -- why not enforce the access control at the VFS layer ?
>>>>>>> It's safer in the long run.
>>>>>> By that you mean more along the lines of Pavel's patch than my whitelist
>>>>>> LSM, or you actually mean Tetsuo's filesystem (i assume you don't mean
>>>>>> that by 'vfs layer' :), or something different entirely?
>>>>> :)
>>>>>
>>>>> By 'vfs' I mean at open() time, and not at mount(), or mknod() time.
>>>>> Either yours or Pavel's; I tend to prefer not to use LSM as it may
>>>>> collide with future security modules.
>>>> Oren, AFAIS you've seen my patches for device access controller, right?
>> If you mean this one:
>> http://openvz.org/pipermail/devel/2007-September/007647.html
>> then ack :)
>
> Great! Thanks.
>
>>>> Maybe we can revisit the issue then and try to come to agreement on what
>>>> kind of model and implementation we all want?
>>> That would be great, Pavel. I do prefer your solution over my LSM, so
>>> if we can get an elegant block device control right in the vfs code that
>>> would be my preference.
>> I concur.
>>
>> So it seems to me that we are all in favor of the model where open()
>> of a device will consult a black/white-list. Also, we are all in favor
>> of a non-LSM implementation, Pavel's code being a good example.
>
> Thank you, Oren and Serge! I will revisit this issue then, but
> I have a vacation the next week and, after this, we have a New
> Year and Christmas holidays in Russia. So I will be able to go
> on with it only after the 7th January :( Hope this is OK for you.
>
> Besides, Andrew said that he would pay little attention to new
> features till the 2.6.24 release, so I'm afraid we won't have this
> even in -mm in the nearest months :(

Sounds great ! (as for the delay, it wasn't the highest priority issue to
begin with, so no worries).

Ah.. coincidentally they are celebrated here, too, at the same time :D
Merry Christmas and Happy New Year !

Oren.

>
> Thanks,
> Pavel
>
>> Oren.
>>
>>> The only thing that makes me keep wanting to go back to an LSM is the
>>> fact that the code defining the whitelist seems out of place in the vfs.
>>> But I guess that's actually separated into a modular cgroup, with the
>>> actual enforcement built in at the vfs. So that's really the best
>>> solution.
>>>
>>> -serge
>
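
To make the model agreed on in this thread concrete (open() of a device
consults a per-container whitelist), here is a minimal userspace sketch of
such a lookup. It is not Pavel's patch and not kernel code: every type and
function name below (dev_class, dev_rule, dev_whitelist, dev_open_allowed,
demo_rules, demo_wl) is hypothetical and chosen only for illustration. The
demo entries use the conventional numbers for /dev/null (char 1:3) and
/dev/zero (char 1:5).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device class: character or block device. */
enum dev_class { DEV_CHAR, DEV_BLOCK };

/* Hypothetical whitelist entry: one permitted device. */
struct dev_rule {
	enum dev_class cls;
	unsigned int major;
	unsigned int minor;
};

/* Hypothetical per-container whitelist. */
struct dev_whitelist {
	const struct dev_rule *rules;
	size_t nr_rules;
};

/*
 * Check that would be consulted at open() time in this sketch: the open
 * is allowed only if some entry matches class, major and minor exactly.
 */
static bool dev_open_allowed(const struct dev_whitelist *wl,
			     enum dev_class cls,
			     unsigned int major, unsigned int minor)
{
	size_t i;

	for (i = 0; i < wl->nr_rules; i++) {
		const struct dev_rule *r = &wl->rules[i];

		if (r->cls == cls && r->major == major && r->minor == minor)
			return true;
	}
	return false;	/* default deny: unlisted devices are refused */
}

/* Demo whitelist for one container: /dev/null and /dev/zero only. */
static const struct dev_rule demo_rules[] = {
	{ DEV_CHAR, 1, 3 },	/* /dev/null */
	{ DEV_CHAR, 1, 5 },	/* /dev/zero */
};

static const struct dev_whitelist demo_wl = {
	demo_rules, sizeof(demo_rules) / sizeof(demo_rules[0])
};
```

Because the check runs at open() rather than at mount() or mknod() time, a
device node that arrives on an externally mounted file system would still be
refused unless it is listed, which is the case raised at the start of the
thread.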
Re: Trailing periods in kernel messages
On Thursday 20 December 2007, Alan Cox wrote:
> The kernel printk messages are sentences.

I'm afraid that I completely and utterly disagree. Kernel messages are _not_
sentences. The vast majority is not well-formed and does not contain any of
the elements that are required for a proper sentence.

The most kernel messages can be compared to is a rather diverse and sloppy
enumeration. And enumerations follow completely different rules than
sentences. It can better be characterized as a "semi-random sequence of
context-sensitive technical messages".

IMHO the existing rule that "Kernel messages do not have to be terminated
with a period." is completely justified, though it does need some minor
clarification on the cases in which proper punctuation _should_ be followed.
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Robert Hancock wrote:
> First off, I would like to see confirmation from the horses' mouths here
> (namely AMD, ServerWorks/Broadcom, and whoever else) that there is no other
> way to get around this problem than disabling MMCONFIG for accesses behind
> those chips.

And here are the excerpts from that page of the spec which are salient to the
present discussion:

--
The base configuration space of the AMD-8132 and PCI(-X) devices attached to
it are accessible using only the mechanism defined in PCI 2.3. Registers of
PCI-X Mode 2 devices attached to the AMD-8132 in the extended configuration
space are not accessible. The AMD-8132 has no registers in the extended
configuration space.

Fix Planned
No
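
Read as a decision rule, the erratum above could be sketched as follows. This
is only an illustration under stated assumptions, not the kernel's PCI code:
the names cfg_method, BASE_CFG_SIZE, EXT_CFG_SIZE and pick_cfg_method() are
hypothetical, and the sketch assumes the usual sizes of 256 bytes for the
base configuration space and 4096 bytes for the extended one.

```c
/* Hypothetical result type for this sketch. */
enum cfg_method { CFG_CONVENTIONAL, CFG_MMCONFIG, CFG_UNREACHABLE };

#define BASE_CFG_SIZE	256u	/* base (PCI 2.3) configuration space, bytes */
#define EXT_CFG_SIZE	4096u	/* extended configuration space, bytes */

/*
 * Pick an access method for one config register.
 * behind_amd8132: device sits behind an AMD-8132 (per the erratum text,
 *                 only the PCI 2.3 mechanism works there).
 * have_mmconfig:  the platform offers MMCONFIG at all.
 */
static enum cfg_method pick_cfg_method(int behind_amd8132, int have_mmconfig,
				       unsigned int offset)
{
	if (offset >= EXT_CFG_SIZE)
		return CFG_UNREACHABLE;		/* outside any config space */

	if (behind_amd8132 || !have_mmconfig) {
		/* Only the base space is reachable, via the old mechanism. */
		return offset < BASE_CFG_SIZE ? CFG_CONVENTIONAL
					      : CFG_UNREACHABLE;
	}

	return CFG_MMCONFIG;
}
```

Under these assumptions a register at offset 0x100 of a device behind the
AMD-8132 comes out as unreachable, while the same offset elsewhere would go
through MMCONFIG.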