Some Serial Wacom Tablet devices failing to return from Hibernate/Suspend

2007-12-20 Thread Michael Heath
Many Ubuntu users have had trouble with Wacom serial tablet
devices (mainly built-in units in Toshiba tablet PCs) failing to
return properly from ACPI Suspend or Hibernate. See the bug report at:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/152187

Tom Jaeger noted there that the bug seems to reside in ACPI-related
IRQ handling code for such devices, introduced in 2.6.21-rc4 by a
patch that fixed parallel port IRQs on resume. You can see his
full bug report on the issue (including some tentative,
shot-in-the-dark patches) at
http://bugzilla.kernel.org/show_bug.cgi?id=9487

This seems to affect a fair number of users, and none of us
active on the bug in the Ubuntu tracker is familiar enough with
the ACPI subsystem to create a correct patch for this issue. Can
anyone who knows this area provide some guidance on
the correct way to fix it, beyond the quick fix Tom Jaeger
posted?

Michael Heath
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] constify tables in kernel/sysctl_check.c

2007-12-20 Thread Jan Beulich
>>> Eric W. Biederman <[EMAIL PROTECTED]> 21.12.07 00:05 >>>
>"Jan Beulich" <[EMAIL PROTECTED]> writes:
>
>> Remains the question whether it is intended that many, perhaps even
>> large, tables are compiled in without ever having a chance to get used,
>> i.e. whether there shouldn't #ifdef CONFIG_xxx get added.
>
>
>The constification looks good.  The file should be compiled only when
>we have sysctl support.  We use those tables when we call 
>register_sysctl_table.  Which we do a lot.

I understand this. Nevertheless, the tables take 23k on 64-bit, and many
of them are unused when certain subsystems aren't being built (and some
are even architecture-specific). The arlan tables are a particularly good
example, but the netfilter ones are pretty big and probably not always
used, either.

Jan
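
For illustration only, a minimal userspace sketch of the #ifdef idea discussed above; it is not code from the patch. struct demo_entry, demo_table and CONFIG_DEMO_ARLAN are hypothetical stand-ins for a translation table and its Kconfig option.

```c
#include <stddef.h>

/* Hypothetical stand-in for one row of a sysctl translation table. */
struct demo_entry {
	int ctl_name;
	const char *procname;
};

/* Rows for an optional subsystem disappear when its option is off. */
static const struct demo_entry demo_table[] = {
	{ 1, "always_present" },
#ifdef CONFIG_DEMO_ARLAN	/* hypothetical option, normally set by Kconfig */
	{ 2, "arlan_only" },
#endif
	{ 0, NULL }		/* terminator */
};

/* Count the rows in front of the terminator. */
static size_t demo_table_len(const struct demo_entry *t)
{
	size_t n = 0;

	while (t[n].procname)
		n++;
	return n;
}
```

Built without the option, the table holds only the unconditional row, so the space for the subsystem-specific rows is never spent.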



Re: almost daily Kernel oops with 2.6.23.9 - and now 2.6.23.11 as well

2007-12-20 Thread Mike Galbraith

On Thu, 2007-12-20 at 19:14 +0100, Hemmann, Volker Armin wrote:

> It is just.. It could be the hardware - but I should have seen the 
> same 'problem' with earlier kernels - and the 'almost daily oops' only 
> started with 2.6.23.

Nonetheless, the oopsen _suggest_ hardware.  If it were my box, I'd move
RAM modules as a first step.  It costs about two minutes to eliminate
that possibility, but you seem reluctant to take that step.  Heck, I'd
_hope_ it's something as simple as bad RAM, because otherwise, the quest
for stability could become a time-consuming and/or expensive undertaking...

If that didn't change anything, I'd go back and stress test a previously
stable configuration to gain confidence in my hardware.  If 'uhoh, not
as stable as I thought' happened, and nothing is getting obviously hot
[1], I'd pray that it's an electrically noisy power supply, because
that's also easy and cheap.  In any case, once I was very very confident
that my hardware was indeed sound, I'd move on to an agonizingly tedious
bisection, with no out of tree modules ever loaded, to narrow down when
this memory corruption that nobody else appears to be hitting appeared.

-Mike

1.  Crappy heatsink compound can dry out and fracture, leaving a hot chip
under a relatively cool heatsink.  This is exactly what I found when I
disassembled my P4 box, which had suddenly become unstable under heavy
load, a while back.



[PATCH 3/3 -mm] kexec jump -v8 : access memory image of kexec_image

2007-12-20 Thread Huang, Ying
This patch adds a file in the proc file system to access the loaded
kexec_image, which may contain the memory image of the kexeced
system. This can be used to:

- Communicate between the original kernel and the kexeced kernel by
  writing to some pages in the original kernel.

- Communicate between the original kernel and the kexeced kernel by
  reading the memory image of the kexeced kernel, amending the image,
  and reloading the amended image.

- Accelerate boot of the kexeced kernel. If you have a memory image of
  the kexeced kernel, you do not need a normal boot process to jump to
  it: just load the memory image and jump to the point where you last
  left the kexeced kernel.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 fs/proc/Makefile  |1 
 fs/proc/kimgcore.c|  277 ++
 fs/proc/proc_misc.c   |6 +
 include/linux/kexec.h |7 +
 kernel/kexec.c|5 
 5 files changed, 291 insertions(+), 5 deletions(-)

--- /dev/null
+++ b/fs/proc/kimgcore.c
@@ -0,0 +1,277 @@
+/*
+ * fs/proc/kimgcore.c - Interface for accessing the loaded
+ * kexec_image, which may contain the memory image of kexeced system.
+ * Heavily borrowed from fs/proc/kcore.c
+ *
+ * Copyright (C) 2007, Intel Corp.
+ *  Huang Ying <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct proc_dir_entry *proc_root_kimgcore;
+
+static u32 kimgcore_size;
+
+static char *elfcorebuf;
+static size_t elfcorebuf_sz;
+
+static void *buf_page;
+
+static ssize_t kimage_copy_to_user(struct kimage *image, char __user *buf,
+				   unsigned long offset, size_t count)
+{
+	kimage_entry_t *ptr, entry;
+	unsigned long off = 0, offinp, trunk;
+	struct page *page;
+	void *vaddr;
+
+	for_each_kimage_entry(image, ptr, entry) {
+		if (!(entry & IND_SOURCE))
+			continue;
+		if (off + PAGE_SIZE > offset) {
+			offinp = offset - off;
+			if (count > PAGE_SIZE - offinp)
+				trunk = PAGE_SIZE - offinp;
+			else
+				trunk = count;
+			page = pfn_to_page(entry >> PAGE_SHIFT);
+			if (PageHighMem(page)) {
+				vaddr = kmap(page);
+				memcpy(buf_page, vaddr+offinp, trunk);
+				kunmap(page);
+				vaddr = buf_page;
+			} else
+				vaddr = __va(entry & PAGE_MASK) + offinp;
+			if (copy_to_user(buf, vaddr, trunk))
+				return -EFAULT;
+			buf += trunk;
+			offset += trunk;
+			count -= trunk;
+			if (!count)
+				break;
+		}
+		off += PAGE_SIZE;
+	}
+	return count;
+}
+
+static ssize_t kimage_copy_from_user(struct kimage *image,
+				     const char __user *buf,
+				     unsigned long offset,
+				     size_t count)
+{
+	kimage_entry_t *ptr, entry;
+	unsigned long off = 0, offinp, trunk;
+	struct page *page;
+	void *vaddr;
+
+	for_each_kimage_entry(image, ptr, entry) {
+		if (!(entry & IND_SOURCE))
+			continue;
+		if (off + PAGE_SIZE > offset) {
+			offinp = offset - off;
+			if (count > PAGE_SIZE - offinp)
+				trunk = PAGE_SIZE - offinp;
+			else
+				trunk = count;
+			page = pfn_to_page(entry >> PAGE_SHIFT);
+			if (PageHighMem(page))
+				vaddr = buf_page;
+			else
+				vaddr = __va(entry & PAGE_MASK) + offinp;
+			if (copy_from_user(vaddr, buf, trunk))
+				return -EFAULT;
+			if (PageHighMem(page)) {
+				vaddr = kmap(page);
+				memcpy(vaddr+offinp, buf_page, trunk);
+				kunmap(page);
+			}
+			buf += trunk;
+			offset += trunk;
+			count -= trunk;
+			if (!count)
+				break;
+		}
+		off += PAGE_SIZE;
+	}
+	return count;
+}
+
+static ssize_t read_kimgcore(struct file *file, char __user *buffer,
+size_t buflen, loff_t *fpos)
+{
+   size_t acc = 0;
+   size_t tsz;

Re: [Jan Beulich] [PATCH] constify tables in kernel/sysctl_check.c

2007-12-20 Thread Jan Beulich
Thanks for catching this!

>>> Dave Jones <[EMAIL PROTECTED]> 21.12.07 03:30 >>>
On Thu, Dec 20, 2007 at 04:14:05PM -0700, Eric W. Biederman wrote:

 > Remains the question whether it is intended that many, perhaps even
 > large, tables are compiled in without ever having a chance to get used,
 > i.e. whether there shouldn't #ifdef CONFIG_xxx get added.

 > -static struct trans_ctl_table trans_net_ax25_param_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

we lost the _param, which will cause a duplicate definition with ..
 
 > -static struct trans_ctl_table trans_net_ax25_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

cut-n-paste thinko ?

Dave

-- 
http://www.codemonkey.org.uk



[PATCH 1/3 -mm] kexec jump -v8 : kexec jump basic

2007-12-20 Thread Huang, Ying
This patch implements the functionality of jumping between the kexeced
kernel and the original kernel.

To support jumping between two kernels, before jumping to (executing)
the new kernel and before jumping back to the original kernel, the
devices are put into a quiescent state, and the state of the devices and
CPU is saved. After jumping back from the kexeced kernel and after
jumping to the new kernel, the state of the devices and CPU is restored
accordingly. The devices/CPU state save/restore code of software suspend
is called to implement the corresponding functions.

To support jumping without reserving memory, one shadow backup page
(source page) is allocated for each page used by the new (kexeced) kernel
(destination page). When kexec_load is done, the image of the new kernel
is loaded into the source pages, and before executing, the destination
pages and the source pages are swapped, so the contents of the
destination pages are backed up. Before jumping to the new (kexeced)
kernel and after jumping back to the original kernel, the destination
pages and the source pages are swapped too.
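
The swap described above can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the patch's code: DEMO_PAGE_SIZE and the demo_* names are hypothetical, and the sketch works on ordinary arrays rather than physical pages.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096	/* assumed page size for the sketch */

/* Swap one destination page with its shadow (source) page. */
static void demo_swap_page(unsigned char *dst, unsigned char *src)
{
	unsigned char tmp[DEMO_PAGE_SIZE];

	memcpy(tmp, dst, DEMO_PAGE_SIZE);
	memcpy(dst, src, DEMO_PAGE_SIZE);
	memcpy(src, tmp, DEMO_PAGE_SIZE);
}

/*
 * Swap every destination page with its shadow page.  One call installs
 * the new image and leaves the old contents in the shadow pages; a
 * second call puts the old contents back.
 */
static void demo_swap_pages(unsigned char (*dst)[DEMO_PAGE_SIZE],
			    unsigned char (*src)[DEMO_PAGE_SIZE],
			    size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++)
		demo_swap_page(dst[i], src[i]);
}
```

Because a swap is its own inverse, the same routine serves both directions of the jump, which is why no separately reserved memory region is needed for the backup.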

A jump back protocol for kexec is defined and documented. It is an
extension to the ordinary function calling protocol, so the facility
provided by this patch can be used to call an ordinary C function in
physical mode.

A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
indicate that the loaded kernel image is used for jumping back.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 Documentation/i386/jump_back_protocol.txt |   66 ++
 arch/powerpc/kernel/machine_kexec.c   |2 
 arch/ppc/kernel/machine_kexec.c   |2 
 arch/sh/kernel/machine_kexec.c|2 
 arch/x86/kernel/machine_kexec_32.c|   39 +-
 arch/x86/kernel/machine_kexec_64.c|2 
 arch/x86/kernel/relocate_kernel_32.S  |  194 ++
 include/asm-x86/kexec_32.h|   34 -
 include/linux/kexec.h |   14 +-
 kernel/kexec.c|   65 +-
 kernel/power/Kconfig  |2 
 kernel/sys.c  |   35 +++--
 12 files changed, 403 insertions(+), 54 deletions(-)

--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
 static u32 kexec_pgd[1024] PAGE_ALIGNED;
@@ -83,10 +84,14 @@ static void load_segments(void)
  * reboot code buffer to allow us to avoid allocations
  * later.
  *
- * Currently nothing.
+ * Turn off NX bit for control page.
  */
 int machine_kexec_prepare(struct kimage *image)
 {
+	if (nx_enabled) {
+		change_page_attr(image->control_code_page, 1, PAGE_KERNEL_EXEC);
+		global_flush_tlb();
+	}
 	return 0;
 }
 
@@ -96,25 +101,45 @@ int machine_kexec_prepare(struct kimage 
  */
 void machine_kexec_cleanup(struct kimage *image)
 {
+	if (nx_enabled) {
+		change_page_attr(image->control_code_page, 1, PAGE_KERNEL);
+		global_flush_tlb();
+	}
 }
 
 /*
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
-NORET_TYPE void machine_kexec(struct kimage *image)
+void machine_kexec(struct kimage *image)
 {
 	unsigned long page_list[PAGES_NR];
 	void *control_page;
+	asmlinkage NORET_TYPE void
+		(*relocate_kernel_ptr)(unsigned long indirection_page,
+				       unsigned long control_page,
+				       unsigned long start_address,
+				       unsigned int has_pae) ATTRIB_NORET;
 
 	/* Interrupts aren't acceptable while we reboot */
 	local_irq_disable();
 
 	control_page = page_address(image->control_code_page);
-	memcpy(control_page, relocate_kernel, PAGE_SIZE);
+	memcpy(control_page, relocate_page, PAGE_SIZE/2);
+	KJUMP_MAGIC(control_page) = 0;
 
+	if (image->preserve_context) {
+		KJUMP_MAGIC(control_page) = KJUMP_MAGIC_NUMBER;
+		if (kexec_jump_save_cpu(control_page)) {
+			image->start = KJUMP_ENTRY(control_page);
+			return;
+		}
+	}
+
+	relocate_kernel_ptr = control_page +
+		((void *)relocate_kernel - (void *)relocate_page);
 	page_list[PA_CONTROL_PAGE] = __pa(control_page);
-	page_list[VA_CONTROL_PAGE] = (unsigned long)relocate_kernel;
+	page_list[VA_CONTROL_PAGE] = (unsigned long)control_page;
 	page_list[PA_PGD] = __pa(kexec_pgd);
 	page_list[VA_PGD] = (unsigned long)kexec_pgd;
 #ifdef CONFIG_X86_PAE
@@ -127,6 +152,7 @@ NORET_TYPE void machine_kexec(struct kim
page_list[VA_PTE_0] = (unsigned long)kexec_pte0;
page_list[PA_PTE_1] = __pa(kexec_pte1);
page_list[VA_PTE_1] = (unsigned 

[PATCH 2/3 -mm] kexec jump -v8 : add write support to oldmem device

2007-12-20 Thread Huang, Ying
This patch adds write support for /dev/oldmem. This can be used to:

- Communicate between the original kernel and the kexeced kernel by
  writing to some pages in the original kernel.

- Restore the memory contents of the hibernated system in kexec-based
  hibernation.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 arch/x86/kernel/crash_dump_32.c |   27 +++
 drivers/char/mem.c  |   32 
 include/linux/crash_dump.h  |2 ++
 3 files changed, 61 insertions(+)

--- a/arch/x86/kernel/crash_dump_32.c
+++ b/arch/x86/kernel/crash_dump_32.c
@@ -59,6 +59,33 @@ ssize_t copy_oldmem_page(unsigned long p
 	return csize;
 }
 
+ssize_t write_oldmem_page(unsigned long pfn, const char *buf,
+			  size_t csize, unsigned long offset, int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	if (!userbuf) {
+		vaddr = kmap_atomic_pfn(pfn, KM_PTE0);
+		memcpy(vaddr + offset, buf, csize);
+	} else {
+		if (!kdump_buf_page) {
+			printk(KERN_WARNING "Kdump: Kdump buffer page not"
+			       " allocated\n");
+			return -EFAULT;
+		}
+		if (copy_from_user(kdump_buf_page, buf, csize))
+			return -EFAULT;
+		vaddr = kmap_atomic_pfn(pfn, KM_PTE0);
+		memcpy(vaddr + offset, kdump_buf_page, csize);
+	}
+	kunmap_atomic(vaddr, KM_PTE0);
+
+	return csize;
+}
+
 static int __init kdump_buf_page_init(void)
 {
 	int ret = 0;
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -11,6 +11,8 @@
 extern unsigned long long elfcorehdr_addr;
 extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 						unsigned long, int);
+extern ssize_t write_oldmem_page(unsigned long, const char *, size_t,
+				 unsigned long, int);
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
 
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -348,6 +348,37 @@ static ssize_t read_oldmem(struct file *
 	}
 	return read;
 }
+
+/*
+ * Write memory corresponding to the old kernel.
+ */
+static ssize_t write_oldmem(struct file *file, const char __user *buf,
+			    size_t count, loff_t *ppos)
+{
+	unsigned long pfn, offset;
+	size_t write = 0, csize;
+	int rc = 0;
+
+	while (count) {
+		pfn = *ppos / PAGE_SIZE;
+		if (pfn > saved_max_pfn)
+			return write;
+
+		offset = (unsigned long)(*ppos % PAGE_SIZE);
+		if (count > PAGE_SIZE - offset)
+			csize = PAGE_SIZE - offset;
+		else
+			csize = count;
+		rc = write_oldmem_page(pfn, buf, csize, offset, 1);
+		if (rc < 0)
+			return rc;
+		buf += csize;
+		*ppos += csize;
+		write += csize;
+		count -= csize;
+	}
+	return write;
+}
 #endif
 
 extern long vread(char *buf, char *addr, unsigned long count);
@@ -783,6 +814,7 @@ static const struct file_operations full
 #ifdef CONFIG_CRASH_DUMP
 static const struct file_operations oldmem_fops = {
 	.read	= read_oldmem,
+	.write	= write_oldmem,
 	.open	= open_oldmem,
 };
 #endif
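
As a hedged illustration of the loop in write_oldmem() above, the following userspace sketch shows how a write is cut into pieces that never cross a page boundary. DEMO_PAGE_SIZE, struct demo_chunk and demo_next_chunk() are hypothetical names used only for this example.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* One page-bounded piece of a larger write. */
struct demo_chunk {
	unsigned long pfn;	/* page frame the piece lands in */
	unsigned long offset;	/* byte offset inside that page */
	size_t len;		/* number of bytes in this piece */
};

/*
 * Work out the next piece of a write that starts at byte position pos
 * and still has count bytes to go.  The piece ends at the page boundary
 * or at the end of the data, whichever comes first.
 */
static struct demo_chunk demo_next_chunk(unsigned long long pos, size_t count)
{
	struct demo_chunk c;

	c.pfn = (unsigned long)(pos / DEMO_PAGE_SIZE);
	c.offset = (unsigned long)(pos % DEMO_PAGE_SIZE);
	c.len = count;
	if (c.len > DEMO_PAGE_SIZE - c.offset)
		c.len = DEMO_PAGE_SIZE - c.offset;
	return c;
}
```

A caller would advance pos by the returned len and reduce count by the same amount until count reaches zero, mirroring the buf, *ppos, write and count updates in the patch.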



[PATCH 0/3 -mm] kexec jump -v8

2007-12-20 Thread Huang, Ying
This patchset provides an enhancement to kexec/kdump. It implements
the following features:

- Backup/restore memory used both by the original kernel and the
  kexeced kernel.

- Jumping between the original kernel and the kexeced kernel.

- Read/write memory image of the kexeced kernel in the original kernel
  and write memory image of the original kernel in the kexeced
  kernel. This can be used as a communication method between the
  kexeced kernel and the original kernel.


The features of this patchset can be used as follows:

- Kernel/system debugging by making system snapshots. You can make a
  system snapshot, jump back, do something and make another system
  snapshot.

- A simple hibernation implementation without ACPI support. You can
  kexec a hibernating kernel, save the memory image of the original
  system and shut down the system. When resuming, you boot a resuming
  kernel in the memory range of the kexeced kernel, restore the memory
  image of the original system and jump back.

- Cooperative multi-kernel/system. With kexec jump, you can switch
  between several kernels/systems quickly, without a boot process except
  the first time. This works like swapping a whole kernel/system out
  and in.

- A general method to call a program in physical mode. This can be used
  to invoke some BIOS code under Linux.

- The basis of a full kexec-based hibernation implementation with ACPI
  support. The full kexec-based hibernation implementation is provided
  in another patchset named kexec based hibernation.


Currently, only the i386 architecture is supported. The patchset is based
on Linux kernel 2.6.24-rc5-mm1, and has been tested on an IBM T42 with
ACPI on and off.


The following user-space tools can be used with kexec jump.

1. kexec-tools needs to be patched to support kexec jump. The patches
   and the precompiled kexec can be downloaded from the following URLs:
   source: http://khibernation.sourceforge.net/download/release_v8/kexec-tools/kexec-tools-src_git_kh8.tar.bz2
   patches: http://khibernation.sourceforge.net/download/release_v8/kexec-tools/kexec-tools-patches_git_kh8.tar.bz2
   binary: http://khibernation.sourceforge.net/download/release_v8/kexec-tools/kexec_git_kh8

2. makedumpfile with patches is used as the memory image saving tool; it
   can exclude free pages from the original kernel memory image file. The
   patches and the precompiled makedumpfile can be downloaded from the
   following URLs:
   source: http://khibernation.sourceforge.net/download/release_v8/makedumpfile/makedumpfile-src_cvs_kh8.tar.bz2
   patches: http://khibernation.sourceforge.net/download/release_v8/makedumpfile/makedumpfile-patches_cvs_kh8.tar.bz2
   binary: http://khibernation.sourceforge.net/download/release_v8/makedumpfile/makedumpfile_cvs_kh8

3. A very simple memory image restoring tool named "krestore" is
   implemented. It can be downloaded from the following URLs:
   source: http://khibernation.sourceforge.net/download/release_v8/krestore/krestore-src_cvs_kh8.tar.bz2
   binary: http://khibernation.sourceforge.net/download/release_v8/krestore/krestore_cvs_kh8

An initramfs image can be used as the root file system of the kexeced
kernel. An initramfs image built with "BuildRoot" can be downloaded
from the following URL:
initramfs image: http://khibernation.sourceforge.net/download/release_v8/initramfs/rootfs_cvs_kh8.gz
All user space tools above are included in the initramfs image.


Usage example of jumping between original and kexeced kernel:

1. Compile and install the patched kernel with the following options selected:

CONFIG_X86_32=y
CONFIG_RELOCATABLE=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PM=y

2. Build an initramfs image containing kexec-tools, or download the
   pre-built initramfs image, called rootfs.gz in the following text.

3. Boot the kernel compiled in step 1.

4. Load the kernel compiled in step 1 with /sbin/kexec. If you want to
   use the "krestore" tool, --elf64-core-headers should be specified on
   the command line of /sbin/kexec. The shell command line can be as
   follows:

   /sbin/kexec --load-jump-back /boot/bzImage --mem-min=0x10
 --mem-max=0xff --elf64-core-headers --initrd=rootfs.gz

5. Boot the kexeced kernel with the following shell command line:

   /sbin/kexec -e

6. The kexeced kernel will boot like a normal kexec. In the kexeced
   kernel, the memory image of the original kernel can be read via
   /proc/vmcore or /dev/oldmem, and can be written via /dev/oldmem. You
   can save/restore/modify it as you like.

7. Prepare to jump back from the kexeced kernel with the following shell
   command lines:

   jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '=' -f 2`
   /sbin/kexec --load-jump-back-helper=$jump_back_entry

8. Jump back to the original kernel with the following shell command line:

   /sbin/kexec -e

9. Now, you are in the original kernel again. You can read/write the
   memory image of kexeced kernel via /proc/kimgcore.

10. You can jump between the original kernel and 

Re: [PATCH 0/4] add task handling notifier

2007-12-20 Thread Jan Beulich
>Yes, but why export variables? Wouldn't it be better to export 
>an API? 
>
>That simplifies the callers (they all pass "current" as task 
>and "task_notifier_list" as arguments).
>
>It also prevents exposing internal variables (notifier lists 
>ARE internal variables) to modules.
>
>What do you think?

Would be a simple change if the concept itself is generally welcome. Will
first see whether I get other comments requiring re-work.

Jan
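
A minimal userspace sketch of the "export an API, not the variable" idea quoted above. It is not the code under review and deliberately avoids the kernel's notifier-chain helpers; every demo_* name is hypothetical. The list is private to the file, and callers only see a register function and a notify function.

```c
#include <stddef.h>

/* Callback type; task is an opaque pointer in this userspace sketch. */
typedef void (*demo_task_cb)(unsigned long event, void *task);

#define DEMO_MAX_CBS 8

/* The list stays private to this file; outside code cannot touch it. */
static demo_task_cb demo_task_cbs[DEMO_MAX_CBS];
static size_t demo_task_cb_count;

/* Register a callback; returns 0 on success, -1 if the list is full. */
int demo_task_notifier_register(demo_task_cb cb)
{
	if (demo_task_cb_count == DEMO_MAX_CBS)
		return -1;
	demo_task_cbs[demo_task_cb_count++] = cb;
	return 0;
}

/* Invoke every registered callback for one event on one task. */
void demo_task_notify(unsigned long event, void *task)
{
	size_t i;

	for (i = 0; i < demo_task_cb_count; i++)
		demo_task_cbs[i](event, task);
}

/* Example callback: remembers the last event it was told about. */
static unsigned long demo_last_event;

static void demo_record_event(unsigned long event, void *task)
{
	(void)task;
	demo_last_event = event;
}
```

With this shape, callers no longer pass the list itself as an argument, and the list can later change its representation without touching any module.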



[PATCH] w1-gpio: Add GPIO w1 bus master driver

2007-12-20 Thread Ville Syrjala
Add a GPIO 1-wire bus master driver. The driver uses the GPIO API to
control the wire, and the GPIO pin can be specified using platform data,
similar to i2c-gpio. The driver was tested with AT91SAM9260 + DS2401.

Signed-off-by: Ville Syrjala <[EMAIL PROTECTED]>
---
 Documentation/w1/masters/00-INDEX |2 +
 Documentation/w1/masters/w1-gpio  |   33 
 drivers/w1/masters/Kconfig|   10 
 drivers/w1/masters/Makefile   |1 +
 drivers/w1/masters/w1-gpio.c  |  100 +
 include/linux/w1-gpio.h   |   21 
 6 files changed, 167 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/w1/masters/w1-gpio
 create mode 100644 drivers/w1/masters/w1-gpio.c
 create mode 100644 include/linux/w1-gpio.h

diff --git a/Documentation/w1/masters/00-INDEX 
b/Documentation/w1/masters/00-INDEX
index 752613c..7b0ceaa 100644
--- a/Documentation/w1/masters/00-INDEX
+++ b/Documentation/w1/masters/00-INDEX
@@ -4,3 +4,5 @@ ds2482
- The Maxim/Dallas Semiconductor DS2482 provides 1-wire busses.
 ds2490
- The Maxim/Dallas Semiconductor DS2490 builds USB <-> W1 bridges.
+w1-gpio
+   - GPIO 1-wire bus master driver.
diff --git a/Documentation/w1/masters/w1-gpio b/Documentation/w1/masters/w1-gpio
new file mode 100644
index 000..c927139
--- /dev/null
+++ b/Documentation/w1/masters/w1-gpio
@@ -0,0 +1,33 @@
+Kernel driver w1-gpio
+=====================
+
+Author: Ville Syrjala <[EMAIL PROTECTED]>
+
+
+Description
+-----------
+
+GPIO 1-wire bus master driver. The driver uses the GPIO API to control the
+wire and the GPIO pin can be specified using platform data. The GPIO pin
+must be configured as open-drain.
+
+
+Example (mach-at91)
+-------------------
+
+#include 
+
+static struct w1_gpio_platform_data foo_w1_gpio_pdata = {
+	.pin = AT91_PIN_PB20,
+};
+
+static struct platform_device foo_w1_device = {
+	.name			= "w1-gpio",
+	.id			= -1,
+	.dev.platform_data	= &foo_w1_gpio_pdata,
+};
+
+...
+	at91_set_GPIO_periph(foo_w1_gpio_pdata.pin, 1);
+	at91_set_multi_drive(foo_w1_gpio_pdata.pin, 1);
+	platform_device_register(&foo_w1_device);
diff --git a/drivers/w1/masters/Kconfig b/drivers/w1/masters/Kconfig
index 8236d44..c449309 100644
--- a/drivers/w1/masters/Kconfig
+++ b/drivers/w1/masters/Kconfig
@@ -42,5 +42,15 @@ config W1_MASTER_DS1WM
  in HP iPAQ devices like h5xxx, h2200, and ASIC3-based like
  hx4700.
 
+config W1_MASTER_GPIO
+	tristate "GPIO 1-wire busmaster"
+	depends on GENERIC_GPIO
+	help
+	  Say Y here if you want to communicate with your 1-wire devices using
+	  GPIO pins. This driver uses the GPIO API to control the wire.
+
+	  This support is also available as a module.  If so, the module
+	  will be called w1-gpio.ko.
+
 endmenu
 
diff --git a/drivers/w1/masters/Makefile b/drivers/w1/masters/Makefile
index 11551b3..1420b5b 100644
--- a/drivers/w1/masters/Makefile
+++ b/drivers/w1/masters/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_W1_MASTER_MATROX)  += matrox_w1.o
 obj-$(CONFIG_W1_MASTER_DS2490) += ds2490.o
 obj-$(CONFIG_W1_MASTER_DS2482) += ds2482.o
 obj-$(CONFIG_W1_MASTER_DS1WM)  += ds1wm.o
+obj-$(CONFIG_W1_MASTER_GPIO)   += w1-gpio.o
diff --git a/drivers/w1/masters/w1-gpio.c b/drivers/w1/masters/w1-gpio.c
new file mode 100644
index 000..c5327df
--- /dev/null
+++ b/drivers/w1/masters/w1-gpio.c
@@ -0,0 +1,100 @@
+/*
+ * w1-gpio - GPIO w1 bus master driver
+ *
+ * Copyright (C) 2007 Ville Syrjala <[EMAIL PROTECTED]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "../w1.h"
+#include "../w1_int.h"
+
+#include 
+
+static void w1_gpio_write_bit(void *data, u8 bit)
+{
+	struct w1_gpio_platform_data *pdata = data;
+
+	gpio_set_value(pdata->pin, bit);
+}
+
+static u8 w1_gpio_read_bit(void *data)
+{
+	struct w1_gpio_platform_data *pdata = data;
+
+	return gpio_get_value(pdata->pin);
+}
+
+static int __init w1_gpio_probe(struct platform_device *pdev)
+{
+	struct w1_bus_master *master;
+	struct w1_gpio_platform_data *pdata;
+	int err;
+
+	pdata = pdev->dev.platform_data;
+	if (!pdata)
+		return -ENXIO;
+
+	master = kzalloc(sizeof *master, GFP_KERNEL);
+	if (!master)
+		return -ENOMEM;
+
+	gpio_direction_output(pdata->pin, 1);
+
+	master->data = pdata;
+	master->read_bit = w1_gpio_read_bit;
+	master->write_bit = w1_gpio_write_bit;
+
+	err = w1_add_master_device(master);
+	if (err) {
+		kfree(master);
+		return err;
+	}
+
+	platform_set_drvdata(pdev, master);
+
+	return 0;
+}
+
+static int 

Re: driver spin lock and files_lock deadlock question

2007-12-20 Thread Al Viro
On Thu, Dec 20, 2007 at 10:59:20PM -0800, Srinivas Kommu wrote:

> It seems this kind of a deadlock can happen with any kernel lock, not 
> just files_lock. What's the driver's mistake here? Is it wrong to call 
> remove_proc_entry() while holding another lock? What is the right thing 
> to do?

remove_proc_entry() is a blocking function...
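
One common way to act on this is to detach the object while holding the lock and call the blocking function only after the lock has been dropped. The userspace sketch below only models that ordering: the demo_* names are hypothetical, the "lock" is a plain flag standing in for spin_lock_bh()/spin_unlock_bh(), and demo_blocking_remove() stands in for a call such as remove_proc_entry().

```c
#include <stddef.h>

/* Hypothetical stand-in for a procfs entry owned by the driver. */
struct demo_entry {
	int registered;
};

struct demo_driver {
	int lock_held;			/* models the driver spinlock */
	struct demo_entry *proc;	/* entry published by the driver */
	int blocked_under_lock;		/* counts the mistake to avoid */
};

static void demo_lock(struct demo_driver *d)   { d->lock_held = 1; }
static void demo_unlock(struct demo_driver *d) { d->lock_held = 0; }

/* Stand-in for a function that may block or take other locks. */
static void demo_blocking_remove(struct demo_driver *d, struct demo_entry *e)
{
	if (d->lock_held)
		d->blocked_under_lock++;
	e->registered = 0;
}

/* Detach the entry under the lock, then do the blocking work unlocked. */
static void demo_teardown(struct demo_driver *d)
{
	struct demo_entry *victim;

	demo_lock(d);
	victim = d->proc;
	d->proc = NULL;
	demo_unlock(d);

	if (victim)
		demo_blocking_remove(d, victim);
}
```

The critical section shrinks to a pointer update, so the softirq path never waits on the driver lock while another lock is being acquired behind it.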


Re: [PATCH -mm 18/43] powerpc compat_binfmt_elf

2007-12-20 Thread Christoph Hellwig
On Thu, Dec 20, 2007 at 03:58:16AM -0800, Roland McGrath wrote:
> +obj-$(CONFIG_PPC64)  += ../../../fs/compat_binfmt_elf.o

Building files from another directory is nasty.  Please add a
CONFIG_BINFMT_COMPAT_ELF so we can simply build it in fs/



Re: [PATCH -mm 09/43] compat_sys_ptrace

2007-12-20 Thread Christoph Hellwig
On Thu, Dec 20, 2007 at 03:55:51AM -0800, Roland McGrath wrote:
> This adds a generic definition of compat_sys_ptrace that calls
> compat_arch_ptrace, parallel to sys_ptrace/arch_ptrace.  Some
> machines needing this already define a function by that name.
> The new generic function is defined only on machines that
> put #define __ARCH_WANT_COMPAT_SYS_PTRACE into asm/ptrace.h.

Nice, we should have unified the compat ptrace code long ago.

Any chance you could make the ifdef symmetric to the native ptrace,
where an arch defines a symbol if it has its own ptrace?

Also, when prototyping something like this I was wondering whether we
really want a separate compat function.  Lots of the ptrace requests
mostly depend on the target process's ABI, not the ptrace caller's, so
maybe doing it like s390 and handling both in the same function might
actually be cleaner.  Anyway, that's probably something to worry about
later, once the arch-specific compat ptrace implementations are gone.



driver spin lock and files_lock deadlock question

2007-12-20 Thread Srinivas Kommu
I have a driver that needs to be SMP-safe. It also has some code hooking 
into the net_rx_action softirq. So it takes a spinlock and disables the 
local bottom-half around its critical sections: 
spin_lock_bh(&driver_lock). Now, I'm facing a deadlock under a 
particular sequence involving the files_lock:

1. CPU 0 takes driver_lock and then calls remove_proc_entry(), which 
hangs at spin_lock(&files_lock).

2. CPU 1 was in fput(), which took files_lock; the softirq comes in at 
this point, attempts to take driver_lock, and hangs forever.


It seems this kind of a deadlock can happen with any kernel lock, not 
just files_lock. What's the driver's mistake here? Is it wrong to call 
remove_proc_entry() while holding another lock? What is the right thing 
to do?


This is with the 2.4 kernel, BTW.


thanks
srini


Re: [PATCH -mm 01/43] user_regset header

2007-12-20 Thread Christoph Hellwig
On Thu, Dec 20, 2007 at 03:53:57AM -0800, Roland McGrath wrote:
> +/*
> + * User-mode machine state access
> + *
> + * Copyright (C) 2007 Red Hat, Inc.  All rights reserved.
> + *
> + * This copyrighted material is made available to anyone wishing to use,
> + * modify, copy, or redistribute it subject to the terms and conditions
> + * of the GNU General Public License v.2.
> + *
> + * Red Hat Author: Roland McGrath.

What's a Red Hat Author?  Sorry for the nitpicking, but why don't you
just use Author like everyone else?



Re: OOPS: 2.6.24-rc5-mm1 -- EIP is at r_show+0x2a/0x70 -- (triggered by "cat /proc/iomem" AFTER suspend-to-disk/resume)

2007-12-20 Thread Andrew Morton
On Fri, 21 Dec 2007 00:58:19 -0500 "Miles Lane" <[EMAIL PROTECTED]> wrote:

> On Dec 20, 2007 12:32 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > On Thu, 20 Dec 2007 08:38:03 -0500 Miles Lane <[EMAIL PROTECTED]>
> > wrote:
> >
> > > On further investigation, "cat /proc/iomem" does not trigger the stack
> > > trace until after a suspend-to-disk/resume cycle has occurred.
> >
> > I still can't reproduce this.
> >
> > Could you please try this?
> >
> > - cat /proc/iomem
> > - suspend/resume
> > - do
> >
> > while read i
> > do
> > echo $i
> > sleep 1
> > done < /proc/iomem
> >
> > then, with luck, we'll be able to work out which /proc/iomem record
> > immediately precedes the corrupted one.
> >
> 
> [EMAIL PROTECTED]:~$ cat > test.sh
> while read i
> do
> echo $i
> sleep 1
> done < /proc/iomem
> ^C
> [EMAIL PROTECTED]:~$ sh test.sh
> -0009f7ff : System RAM
> 0009f800-0009 : reserved
> 000a-000b : Video RAM area
> 000c-000c7fff : Video ROM
> 000f-000f : System ROM
> 0010-7f68 : System RAM
> 0010-0039e4b7 : Kernel code
> 0039e4b8-004f0983 : Kernel data
> 00553000-007ecdfb : Kernel bss
> 7f69-7f698fff : ACPI Tables
> 7f699000-7f6f : ACPI Non-volatile Storage
> 7f70-7fff : reserved
> 8800-8bff : PCI CardBus #05
> 8c00-8fff : PCI CardBus #05
> Segmentation fault
> 
> How do I determine what comes next?
> 

By comparing it with the /proc/iomem from prior to suspending the machine.
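
A minimal sketch of that comparison, for illustration only. It assumes the
pre-suspend listing was saved as an array of lines; next_iomem_record() and
skip_indent() are hypothetical helper names, not existing tools. Leading
whitespace is skipped because the "echo $i" loop drops the indentation that
/proc/iomem uses for nested entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skip the indentation /proc/iomem uses for nested resources. */
static const char *skip_indent(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/*
 * saved:     /proc/iomem lines captured before suspend (hypothetical input)
 * count:     number of entries in saved
 * last_good: last line that could still be read after resume
 *
 * Returns the saved line that followed last_good, or NULL if last_good is
 * not in the listing or was its final entry.
 */
const char *next_iomem_record(const char *const *saved, size_t count,
                              const char *last_good)
{
    assert(saved != NULL && last_good != NULL);

    for (size_t i = 0; i + 1 < count; i++) {
        if (strcmp(skip_indent(saved[i]), skip_indent(last_good)) == 0)
            return saved[i + 1];
    }
    return NULL;
}
```

Called with the last line printed before the segfault, it returns whichever
record followed that line in the saved listing, which is the first candidate
for the corrupted entry.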


Re: [PATCH] misc: Removal of final callers using fastcall

2007-12-20 Thread Harvey Harrison
On Thu, 2007-12-20 at 18:30 -0800, Andrew Morton wrote:
> On Wed, 12 Dec 2007 15:38:26 -0800 Harvey Harrison <[EMAIL PROTECTED]> wrote:
> 
> > Andrew, I'm not sure who is best to hit with these final dribs and
> > drabs removing fastcall.  Once all of these have hit Linus' tree
> > I will send a final patch deleting the include/linux/linkage.h
> > definitions as well as any remaining occurrences.
> 
> Yes, that's a good approach, thanks.  Wait until the tree is fastcall-clean
> and then kill the definition(s).
> 
> I think I skipped rather a lot of remove-fastcall patches because a)
> suitable maintainers were cc'ed and b) I was going through a
> suicidal-over-bug-reports phase.
> 
> Please keep them coming - I've always disliked fastcall.

Once I see these have hit the main tree, I'll send a patch catching any
more that have snuck in for the next rc.  After that there should be
few enough left that I can send you a small patch for the next rc
with the definition removal as well.

I'll keep on top of these.

Harvey





Re: OOPS: 2.6.24-rc5-mm1 -- EIP is at r_show+0x2a/0x70 -- (triggered by "cat /proc/iomem" AFTER suspend-to-disk/resume)

2007-12-20 Thread Miles Lane
Resending...  Curse GMail's HTML messages!

On Dec 21, 2007 12:58 AM, Miles Lane <[EMAIL PROTECTED]> wrote:
>
> On Dec 20, 2007 12:32 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > On Thu, 20 Dec 2007 08:38:03 -0500 Miles Lane <[EMAIL PROTECTED]> wrote:
> >
> > > On further investigation, "cat /proc/iomem" does not trigger the stack
> > > trace until after a suspend-to-disk/resume cycle has occurred.
> >
> > I still can't reproduce this.
> >
> > Could you please try this?
> >
> > - cat /proc/iomem
> > - suspend/resume
> > - do
> >
> > while read i
> > do
> > echo $i
> > sleep 1
> > done < /proc/iomem
> >
> > then, with luck, we'll be able to work out which /proc/iomem record
> > immediately precedes the corrupted one.
> >
>
> [EMAIL PROTECTED]:~$ cat > test.sh
>
> while read i
> do
> echo $i
> sleep 1
> done < /proc/iomem
> ^C
> [EMAIL PROTECTED]:~$ sh test.sh
> -0009f7ff : System RAM
> 0009f800-0009 : reserved
> 000a-000b : Video RAM area
> 000c-000c7fff : Video ROM
> 000f-000f : System ROM
> 0010-7f68 : System RAM
> 0010-0039e4b7 : Kernel code
> 0039e4b8-004f0983 : Kernel data
> 00553000-007ecdfb : Kernel bss
> 7f69-7f698fff : ACPI Tables
> 7f699000-7f6f : ACPI Non-volatile Storage
> 7f70-7fff : reserved
> 8800-8bff : PCI CardBus #05
> 8c00-8fff : PCI CardBus #05
> Segmentation fault
>
> How do I determine what comes next?
>
> Thanks,
>  Miles
>


Re: After many hours all outbound connections get stuck in SYN_SENT

2007-12-20 Thread Jan Engelhardt

On Dec 20 2007 23:05, Ilpo Järvinen wrote:
>> 
>> Given the fact that I've had this problem for so long, over a variety
>> of networking hardware vendors and colo-facilities, this really sounds
>> good to me.  It will be challenging for me to justify a kernel core
>> dump, but a simple patch to dump the Sack data would be do-able.
>
>If your symptoms really are: SYNs leaving (if they show up in tcpdump, for 
>sure they've left TCP code already) and SYN-ACK not showing up even in 
>something as early as in tcpdump (for sure TCP side code didn't execute at 
>that point yet), there's very little chance that Linux' TCP code has some 
>bug in it, only things that do something in such scenario are the SYN 
>generation and retransmitting SYNs (and those are trivially verifiable 
>from tcpdump).
>
Take a machine, put two interfaces in it, configure as bridge (br0
over eth0 and eth1 without any assigned ip addresses), put it between
end node and the cisco. tcpdump there, which should give an unbiased
view wrt. endnode/cisco. Then perhaps, also configure such a network
listening bridge on the other side of the cisco, e.g. on the link to
the internet and watch that. Compare the two tcpdumps and see if
sack got trashed.


RE: [patch 05/24] Text Edit Lock - Architecture Independent Code

2007-12-20 Thread zhangxiliang

> > ===
> > --- linux-2.6-lttng.orig/kernel/kprobes.c   2007-12-12
> > 18:10:32.0 -0500
> > +++ linux-2.6-lttng/kernel/kprobes.c2007-12-12
> > 18:10:34.0 -0500
> > @@ -644,7 +644,9 @@ valid_p:
> > list_del_rcu(>list);
> > kfree(old_p);
> > }
> > +   mutex_lock(_mutex);
> > arch_remove_kprobe(p);
> > +   mutex_unlock(_mutex);
> > } else {
> > mutex_lock(_mutex);
> > if (p->break_handler)
>
> I think "mutex_lock" and "mutex_unlock" should be in architecture code.
> In the "__register_kprobe" function, the implementations of
> "arch_prepare_kprobe" and "arch_arm_kprobe" also depend on the arch, so
> the remove implementation differs between architectures as well.
>
> Maybe "arch_remove_kprobe" won't need the mutex_lock on some embedded
> system chips, if Linux comes to support other embedded system chips in
> the future.

Could we insert the "mutex_lock" and "mutex_unlock" into "free_insn_slot"
instead of architecture code?

modify as follows:

void __kprobes free_insn_slot(kprobe_opcode_t * slot, int dirty)
{
struct kprobe_insn_page *kip;
struct hlist_node *pos;

+   mutex_lock(_mutex);
hlist_for_each_entry(kip, pos, _insn_pages, hlist) {
if (kip->insns <= slot &&
slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) {
int i = (slot - kip->insns) / MAX_INSN_SIZE;
if (dirty) {
kip->slot_used[i] = SLOT_DIRTY;
kip->ngarbage++;
} else {
collect_one_slot(kip, i);
}
break;
}
}

if (dirty && ++kprobe_garbage_slots > INSNS_PER_PAGE)
collect_garbage_slots();
+   mutex_unlock(_mutex);
}


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of zhangxiliang
> Sent: Friday, December 21, 2007 1:19 PM
> To: 'Mathieu Desnoyers'; [EMAIL PROTECTED]; 'Ingo
> Molnar'; linux-kernel@vger.kernel.org
> Cc: 'Andi Kleen'
> Subject: RE: [patch 05/24] Text Edit Lock - Architecture
> Independent Code
>
> hello,
>I have some questions for your patches.
>
> > Paravirt and alternatives are always done when SMP is
> > inactive, so there is no
> > need to use locks.
>
> > -#ifndef CONFIG_KPROBES
> > -#ifdef CONFIG_HOTPLUG_CPU
> > -   /* It must still be possible to apply SMP alternatives. */
> > -   if (num_possible_cpus() <= 1)
> > -#endif
> > -   {
> > -   change_page_attr(virt_to_page(start),
> > -size >> PAGE_SHIFT, PAGE_KERNEL_RX);
> > -   printk("Write protecting the kernel text:
> > %luk\n", size >> 10);
> > -   }
> > -#endif
> > +   change_page_attr(virt_to_page(start),
> > +   size >> PAGE_SHIFT, PAGE_KERNEL_RX);
> > +   printk(KERN_INFO "Write protecting the kernel text: %luk\n",
> > +   size >> 10);
> > +
>
> Why doesn't "mark_rodata_ro" consider the SMP case? Maybe it
> will be applied in the future.
>
>
> > ===
> > --- linux-2.6-lttng.orig/kernel/kprobes.c   2007-12-12
> > 18:10:32.0 -0500
> > +++ linux-2.6-lttng/kernel/kprobes.c2007-12-12
> > 18:10:34.0 -0500
> > @@ -644,7 +644,9 @@ valid_p:
> > list_del_rcu(>list);
> > kfree(old_p);
> > }
> > +   mutex_lock(_mutex);
> > arch_remove_kprobe(p);
> > +   mutex_unlock(_mutex);
> > } else {
> > mutex_lock(_mutex);
> > if (p->break_handler)
>
> I think "mutex_lock" and "mutex_unlock" should be in architecture code.
> In the "__register_kprobe" function, the implementations of
> "arch_prepare_kprobe" and "arch_arm_kprobe" also depend on the arch, so
> the remove implementation differs between architectures as well.
>
> Maybe "arch_remove_kprobe" won't need the mutex_lock on some embedded
> system chips, if Linux comes to support other embedded system chips in
> the future.
>
>
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of
> > Mathieu Desnoyers
> > Sent: Friday, December 21, 2007 9:55 AM
> > To: [EMAIL PROTECTED]; Ingo Molnar;
> > linux-kernel@vger.kernel.org
> > Cc: Mathieu Desnoyers; Andi Kleen
> > Subject: [patch 05/24] Text Edit Lock - Architecture
> Independent Code
> >
> > This is an architecture independant synchronization around
> kernel text
> > modifications through use of a global mutex.
> >
> > A mutex has been chosen so that kprobes, the main user of
> > this, can sleep during
> > memory allocation between the memory read of the instructions
> > it must replace
> > and the memory write of the 

i2c block read on an SMBus

2007-12-20 Thread Venkat Subbiah
I am trying to do an i2c block read using a call like 

rc = i2c_smbus_xfer(g_i2c_adp, buf[0], 0x0,
  I2C_SMBUS_READ, 0x0,
  I2C_SMBUS_I2C_BLOCK_DATA, );

and the logs show that this hits the else branch of this if statement in the
i801_block_transaction function in i2c-i801.c (kernel version 2.6.23.11):

if (command == I2C_SMBUS_I2C_BLOCK_DATA) {
        if (read_write == I2C_SMBUS_WRITE) {
                /* set I2C_EN bit in configuration register */
                pci_read_config_byte(I801_dev, SMBHSTCFG, &hostc);
                pci_write_config_byte(I801_dev, SMBHSTCFG,
                                      hostc | SMBHSTCFG_I2C_EN);
        } else {
                dev_err(&I801_dev->dev,
                        "I2C_SMBUS_I2C_BLOCK_READ not DB!\n");
                return -1;
        }
}

Some time ago, while doing a web search, I seem to have run into a patch
that allows doing an I2C block read on SMBus. Is there a patch for this?
(Output from my lspci: 00:1f.3 SMBus: Intel Corporation 6300ESB SMBus
Controller (rev 02))


Looking at the documentation for the 6300ESB SMBus Controller, it seems that
the only I2C read transaction supported is a block read. All the other
read transactions are SMBus type.

Why is the I2C block read not supported in the driver?

Thanks in advance for all the input.  Please CC me on the replies as I am not
subscribed to the list.


Thx,
Venkat




Re: iommu dma mapping alignment requirements

2007-12-20 Thread Benjamin Herrenschmidt
BTW. I need to know urgently what HW is broken by this 

Ben.




Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()

2007-12-20 Thread Benjamin Herrenschmidt

On Thu, 2007-12-20 at 21:11 -0800, Greg KH wrote:
> On Fri, Dec 21, 2007 at 03:47:28PM +1100, Benjamin Herrenschmidt wrote:
> > pci: Remove pci_enable_device_bars() fix for qla
> > 
> > The previous patch missed one occurrence of pci_enable_device_bars()
> > in the qla2xxx driver. This fixes it.
> 
> Should I just merge this with your 2/3 patch so everything is sane?

Sure.

Cheers,
Ben.



Re: [PATCH 2/3] Add GD-Rom support to the SEGA Dreamcast

2007-12-20 Thread Paul Mundt
On Thu, Dec 20, 2007 at 09:53:54PM +, Adrian McMenamin wrote:
> On 16/12/2007, Paul Mundt <[EMAIL PROTECTED]> wrote:
> > Also, __devinit/__devexit annotations?
> >
> 
> Is there any difference between __init and __devinit?

Yes.


[PATCH] ecryptfs: check for existing key_tfm at mount time

2007-12-20 Thread Eric Sandeen
Jeff Moyer pointed out that a mount; umount loop of ecryptfs,
with the same cipher & other mount options, created a new 
ecryptfs_key_tfm_cache item each time, and the cache could
grow quite large this way.

Looking at this with mhalcrow, we saw that ecryptfs_parse_options()
unconditionally called ecryptfs_add_new_key_tfm(), which is what
was adding these items.

Refactor ecryptfs_get_tfm_and_mutex_for_cipher_name() to create a 
new helper function, ecryptfs_tfm_exists(), which checks for the 
cipher on the cached key_tfm_list, and sets a pointer
to it if it exists.  This can then be called from 
ecryptfs_parse_options(), and new key_tfm's can be added only when
a cached one is not found.

Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>
---

Index: linux-2.6.24-rc3/fs/ecryptfs/crypto.c
===
--- linux-2.6.24-rc3.orig/fs/ecryptfs/crypto.c
+++ linux-2.6.24-rc3/fs/ecryptfs/crypto.c
@@ -1868,6 +1868,33 @@ out:
return rc;
 }
 
+/**
+ * ecryptfs_tfm_exists - Search for existing tfm for cipher_name.
+ * @cipher_name: the name of the cipher to search for
+ * @key_tfm: set to corresponding tfm if found
+ *
+ * Returns 1 if found, with key_tfm set
+ * Returns 0 if not found, key_tfm set to NULL
+ */
+int ecryptfs_tfm_exists(char *cipher_name, struct ecryptfs_key_tfm **key_tfm)
+{
+   struct ecryptfs_key_tfm *tmp_key_tfm;
+
+   mutex_lock(&key_tfm_list_mutex);
+   list_for_each_entry(tmp_key_tfm, &key_tfm_list, key_tfm_list) {
+   if (strcmp(tmp_key_tfm->cipher_name, cipher_name) == 0) {
+   mutex_unlock(&key_tfm_list_mutex);
+   if (key_tfm)
+   (*key_tfm) = tmp_key_tfm;
+   return 1;
+   }
+   }
+   mutex_unlock(&key_tfm_list_mutex);
+   if (key_tfm)
+   (*key_tfm) = NULL;
+   return 0;
+}
+
 int ecryptfs_get_tfm_and_mutex_for_cipher_name(struct crypto_blkcipher **tfm,
   struct mutex **tfm_mutex,
   char *cipher_name)
@@ -1877,22 +1904,15 @@ int ecryptfs_get_tfm_and_mutex_for_ciphe
 
(*tfm) = NULL;
(*tfm_mutex) = NULL;
-   mutex_lock(&key_tfm_list_mutex);
-   list_for_each_entry(key_tfm, &key_tfm_list, key_tfm_list) {
-   if (strcmp(key_tfm->cipher_name, cipher_name) == 0) {
-   (*tfm) = key_tfm->key_tfm;
-   (*tfm_mutex) = &key_tfm->key_tfm_mutex;
-   mutex_unlock(&key_tfm_list_mutex);
+
+   if (!ecryptfs_tfm_exists(cipher_name, &key_tfm)) {
+   rc = ecryptfs_add_new_key_tfm(&key_tfm, cipher_name, 0);
+   if (rc) {
+   printk(KERN_ERR "Error adding new key_tfm to list; "
+   "rc = [%d]\n", rc);
goto out;
}
}
-   mutex_unlock(&key_tfm_list_mutex);
-   rc = ecryptfs_add_new_key_tfm(&key_tfm, cipher_name, 0);
-   if (rc) {
-   printk(KERN_ERR "Error adding new key_tfm to list; rc = [%d]\n",
-  rc);
-   goto out;
-   }
(*tfm) = key_tfm->key_tfm;
(*tfm_mutex) = &key_tfm->key_tfm_mutex;
 out:
Index: linux-2.6.24-rc3/fs/ecryptfs/ecryptfs_kernel.h
===
--- linux-2.6.24-rc3.orig/fs/ecryptfs/ecryptfs_kernel.h
+++ linux-2.6.24-rc3/fs/ecryptfs/ecryptfs_kernel.h
@@ -623,6 +623,7 @@ ecryptfs_add_new_key_tfm(struct ecryptfs
 size_t key_size);
 int ecryptfs_init_crypto(void);
 int ecryptfs_destroy_crypto(void);
+int ecryptfs_tfm_exists(char *cipher_name, struct ecryptfs_key_tfm **key_tfm);
 int ecryptfs_get_tfm_and_mutex_for_cipher_name(struct crypto_blkcipher **tfm,
   struct mutex **tfm_mutex,
   char *cipher_name);
Index: linux-2.6.24-rc3/fs/ecryptfs/main.c
===
--- linux-2.6.24-rc3.orig/fs/ecryptfs/main.c
+++ linux-2.6.24-rc3/fs/ecryptfs/main.c
@@ -410,9 +410,11 @@ static int ecryptfs_parse_options(struct
if (!cipher_key_bytes_set) {
mount_crypt_stat->global_default_cipher_key_size = 0;
}
-   rc = ecryptfs_add_new_key_tfm(
-   NULL, mount_crypt_stat->global_default_cipher_name,
-   mount_crypt_stat->global_default_cipher_key_size);
+   if (!ecryptfs_tfm_exists(mount_crypt_stat->global_default_cipher_name,
+NULL))
+   rc = ecryptfs_add_new_key_tfm(
+   NULL, mount_crypt_stat->global_default_cipher_name,
+   mount_crypt_stat->global_default_cipher_key_size);
if (rc) {
printk(KERN_ERR "Error attempting to initialize cipher with "
   "name = [%s] and key 
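
The look-up-before-add idea described in the changelog can be sketched in
user space roughly as follows. This is only an illustration under stated
assumptions: struct cache_entry, cache_find(), cache_add(),
cache_get_or_add() and SKETCH_NAME_MAX are hypothetical names, not eCryptfs
symbols, and the sketch is single-threaded, whereas the patch walks the list
under a mutex.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NAME_MAX 32   /* hypothetical limit, not the eCryptfs one */

struct cache_entry {
    char cipher_name[SKETCH_NAME_MAX];
    struct cache_entry *next;
};

struct cache_entry *cache_head;  /* head of the cached-entry list */
int cache_entries;               /* entry count, kept for inspection */

/* Return the cached entry for cipher_name, or NULL if there is none. */
struct cache_entry *cache_find(const char *cipher_name)
{
    for (struct cache_entry *e = cache_head; e != NULL; e = e->next)
        if (strcmp(e->cipher_name, cipher_name) == 0)
            return e;
    return NULL;
}

/* Unconditionally add a new entry; returns 0 on success, -1 on failure. */
int cache_add(const char *cipher_name)
{
    struct cache_entry *e;

    if (strlen(cipher_name) >= SKETCH_NAME_MAX)
        return -1;              /* name would not fit, refuse it */
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    strcpy(e->cipher_name, cipher_name);
    e->next = cache_head;
    cache_head = e;
    cache_entries++;
    return 0;
}

/* Add an entry only when no cached one exists, so repeats do not grow the list. */
int cache_get_or_add(const char *cipher_name)
{
    assert(cipher_name != NULL);

    if (cache_find(cipher_name) != NULL)
        return 0;
    return cache_add(cipher_name);
}
```

Calling cache_get_or_add() repeatedly with the same name, as a mount; umount
loop would, leaves a single entry on the list, whereas calling cache_add()
directly each time adds one entry per call.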

RE: [patch 05/24] Text Edit Lock - Architecture Independent Code

2007-12-20 Thread zhangxiliang
hello,
   I have some questions for your patches.

> Paravirt and alternatives are always done when SMP is
> inactive, so there is no
> need to use locks.

> -#ifndef CONFIG_KPROBES
> -#ifdef CONFIG_HOTPLUG_CPU
> - /* It must still be possible to apply SMP alternatives. */
> - if (num_possible_cpus() <= 1)
> -#endif
> - {
> - change_page_attr(virt_to_page(start),
> -  size >> PAGE_SHIFT, PAGE_KERNEL_RX);
> - printk("Write protecting the kernel text:
> %luk\n", size >> 10);
> - }
> -#endif
> + change_page_attr(virt_to_page(start),
> + size >> PAGE_SHIFT, PAGE_KERNEL_RX);
> + printk(KERN_INFO "Write protecting the kernel text: %luk\n",
> + size >> 10);
> +

Why doesn't "mark_rodata_ro" consider the SMP case? Maybe it will be applied in
the future.


> ===
> --- linux-2.6-lttng.orig/kernel/kprobes.c 2007-12-12
> 18:10:32.0 -0500
> +++ linux-2.6-lttng/kernel/kprobes.c  2007-12-12
> 18:10:34.0 -0500
> @@ -644,7 +644,9 @@ valid_p:
>   list_del_rcu(>list);
>   kfree(old_p);
>   }
> + mutex_lock(_mutex);
>   arch_remove_kprobe(p);
> + mutex_unlock(_mutex);
>   } else {
>   mutex_lock(_mutex);
>   if (p->break_handler)

I think "mutex_lock" and "mutex_unlock" should be in architecture code.
In the "__register_kprobe" function, the implementations of
"arch_prepare_kprobe" and "arch_arm_kprobe" also depend on the arch, so
the remove implementation differs between architectures as well.

Maybe "arch_remove_kprobe" won't need the mutex_lock on some embedded
system chips, if Linux comes to support other embedded system chips in
the future.


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Mathieu Desnoyers
> Sent: Friday, December 21, 2007 9:55 AM
> To: [EMAIL PROTECTED]; Ingo Molnar;
> linux-kernel@vger.kernel.org
> Cc: Mathieu Desnoyers; Andi Kleen
> Subject: [patch 05/24] Text Edit Lock - Architecture Independent Code
>
> This is an architecture independant synchronization around kernel text
> modifications through use of a global mutex.
>
> A mutex has been chosen so that kprobes, the main user of
> this, can sleep during
> memory allocation between the memory read of the instructions
> it must replace
> and the memory write of the breakpoint.
>
> Other user of this interface: immediate values.
>
> Paravirt and alternatives are always done when SMP is
> inactive, so there is no
> need to use locks.
>
> Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
> CC: Andi Kleen <[EMAIL PROTECTED]>
> ---
>  include/linux/memory.h |7 +++
>  mm/memory.c|   34 ++
>  2 files changed, 41 insertions(+)
>
> Index: linux-2.6-lttng/include/linux/memory.h
> ===
> --- linux-2.6-lttng.orig/include/linux/memory.h
> 2007-11-07 11:11:26.0 -0500
> +++ linux-2.6-lttng/include/linux/memory.h2007-11-07
> 11:13:48.0 -0500
> @@ -93,4 +93,11 @@ extern int memory_notify(unsigned long v
>  #define hotplug_memory_notifier(fn, pri) do { } while (0)
>  #endif
>
> +/*
> + * Take and release the kernel text modification lock, used
> for code patching.
> + * Users of this lock can sleep.
> + */
> +extern void kernel_text_lock(void);
> +extern void kernel_text_unlock(void);
> +
>  #endif /* _LINUX_MEMORY_H_ */
> Index: linux-2.6-lttng/mm/memory.c
> ===
> --- linux-2.6-lttng.orig/mm/memory.c  2007-11-07
> 11:12:33.0 -0500
> +++ linux-2.6-lttng/mm/memory.c   2007-11-07
> 11:14:25.0 -0500
> @@ -50,6 +50,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>
>  #include 
>  #include 
> @@ -84,6 +86,12 @@ EXPORT_SYMBOL(high_memory);
>
>  int randomize_va_space __read_mostly = 1;
>
> +/*
> + * mutex protecting text section modification (dynamic code
> patching).
> + * some users need to sleep (allocating memory...) while
> they hold this lock.
> + */
> +static DEFINE_MUTEX(text_mutex);
> +
>  static int __init disable_randmaps(char *s)
>  {
>   randomize_va_space = 0;
> @@ -2748,3 +2756,29 @@ int access_process_vm(struct task_struct
>
>   return buf - old_buf;
>  }
> +
> +/**
> + * kernel_text_lock -   Take the kernel text modification lock
> + *
> + * Insures mutual write exclusion of kernel and modules text
> live text
> + * modification. Should be used for code patching.
> + * Users of this lock can sleep.
> + */
> +void __kprobes kernel_text_lock(void)
> +{
> + mutex_lock(_mutex);
> +}
> +EXPORT_SYMBOL_GPL(kernel_text_lock);
> +
> +/**
> + * kernel_text_unlock   -   Release the kernel text modification lock
> + *
> + * Insures mutual write exclusion of kernel and 

Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()

2007-12-20 Thread Greg KH
On Fri, Dec 21, 2007 at 03:47:28PM +1100, Benjamin Herrenschmidt wrote:
> pci: Remove pci_enable_device_bars() fix for qla
> 
> The previous patch missed one occurrence of pci_enable_device_bars()
> in the qla2xxx driver. This fixes it.

Should I just merge this with your 2/3 patch so everything is sane?

thanks,

greg k-h


Re: Linux 2.6.24-rc6

2007-12-20 Thread Linus Torvalds


On Thu, 20 Dec 2007, Linus Torvalds wrote:
>
> And here's the git patch to avoid this optimization when there is 
> context.

Actually, the code to find one '\n' is still needed to avoid the 
(pathological) case of getting a "\No newline", so scrap that one which 
was too aggressive, and use this (simpler) one instead.

Not that it matters in real life, since nobody uses -U0, and "git blame" 
won't care. But let's get it right anyway ;)

This whole function has had more bugs than it has lines.

Linus

---
 xdiff-interface.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 9ee877c..711029e 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -115,15 +115,18 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
char *bp = b->ptr + b->size;
long smaller = (a->size < b->size) ? a->size : b->size;
 
+   if (ctx)
+   return;
+
while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) {
trimmed += blk;
ap -= blk;
bp -= blk;
}
 
-   while (recovered < trimmed && 0 <= ctx)
+   while (recovered < trimmed)
if (ap[recovered++] == '\n')
-   ctx--;
+   break;
a->size -= (trimmed - recovered);
b->size -= (trimmed - recovered);
 }
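
For anyone following along without the git source at hand, here is a rough
user-space sketch of what the function does after this patch, as far as it
can be read from the hunk above. struct buf stands in for mmfile_t, and the
block size is passed in as a parameter; both are assumptions made here so
that small inputs exercise the loop, not git's actual interface.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for mmfile_t: a pointer plus a length (hypothetical type). */
struct buf {
    const char *ptr;
    long size;
};

/*
 * Drop the tail that a and b have in common, in whole blocks of blk bytes,
 * then give back bytes up to and including the first newline so that both
 * buffers still end on a line boundary. Does nothing when context lines
 * are requested (ctx != 0), as in the patch.
 */
void trim_common_tail_sketch(struct buf *a, struct buf *b, long ctx, long blk)
{
    long trimmed = 0, recovered = 0;
    const char *ap = a->ptr + a->size;
    const char *bp = b->ptr + b->size;
    long smaller = (a->size < b->size) ? a->size : b->size;

    assert(blk > 0);

    if (ctx)
        return;

    /* Walk backwards one block at a time while the tails are identical. */
    while (blk + trimmed <= smaller &&
           !memcmp(ap - blk, bp - blk, (size_t)blk)) {
        trimmed += blk;
        ap -= blk;
        bp -= blk;
    }

    /* Scan forward to the first newline inside the trimmed region. */
    while (recovered < trimmed)
        if (ap[recovered++] == '\n')
            break;

    a->size -= (trimmed - recovered);
    b->size -= (trimmed - recovered);
}
```

With a block size of 4, "ab\none\ntwo\nthree\n" and "abcd\none\ntwo\nthree\n"
end up 7 and 9 bytes long: three 4-byte blocks are trimmed, then the forward
scan gives back the two bytes up to and including the first newline.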


Re: After many hours all outbound connections get stuck in SYN_SENT

2007-12-20 Thread Glen Turner

> I do have TCP Sequence # Randomization enabled on my router.

Huh?  Do you mean a PIX blade in a Cisco switch-router chassis? It
would be very useful if you could be less vague about the
equipment in use.

>  However,
> if this was causing an issue, wouldn't it always occur and cause
> connection issues, not just after 38 hours of correct operation?

That depends more on your customers' networking attributes
than you are sharing or perhaps even know.  Perhaps your customer
base is very Window-skewed and you simply aren't seeing any Sack
Permitted negotiations for the first 37.999 hours. Or
perhaps you've had a network glitch, and all of your
connections have done a Selective Ack, which the firewall
has trashed, leaving all the connections in a wacko state,
not just a few which you haven't noticed.

The actual failure mode needs a packet trace to determine,
but you should be able to do this yourself (or ask your
local network engineering staff).

If your firewall is trashing the Sack field, then it needs
to be fixed.  Time to raise a case with the Cisco TAC and
ask them directly if your PIX version has bug CSCse14419.
You can't expect Sack to work when it's being fed trash,
so it is important to make sure that is not happening.

Cheers, Glen
#include 
#undef KERNEL_HACKER



Re: Linux 2.6.24-rc6

2007-12-20 Thread Kyle McMartin
On Thu, Dec 20, 2007 at 08:40:54PM -0800, Linus Torvalds wrote:
> That was a rather long-winded explanation of what happened, mainly because 
> it was all very unexpected to me, and I had personally mistakenly thought 
> the git optimization was perfectly valid and actually had to go through 
> the end result to see what was going on.
> 
> Anyway, the diff on kernel.org should be all ok now, and mirrored out too.
> 

Thanks again for being so quick to track this down, applies fine and is
out for building in rawhide now.

cheers, Kyle


Re: iommu dma mapping alignment requirements

2007-12-20 Thread Steve Wise



Benjamin Herrenschmidt wrote:
>> Sounds good.  Thanks!
>>
>> Note, that these smaller sub-host-page-sized mappings might pollute the
>> address space causing full aligned host-page-size maps to become
>> scarce...  Maybe there's a clever way to keep those in their own segment
>> of the address space?
>
> We already have a large vs. small split in the iommu virtual space to
> alleviate this (though it's not a hard constraint, we can still get
> into the "other" side if the default one is full).
>
> Try that patch and let me know:

Seems to be working!

:)




Index: linux-work/arch/powerpc/kernel/iommu.c
===
--- linux-work.orig/arch/powerpc/kernel/iommu.c 2007-12-21 10:39:39.0 
+1100
+++ linux-work/arch/powerpc/kernel/iommu.c  2007-12-21 10:46:18.0 
+1100
@@ -278,6 +278,7 @@ int iommu_map_sg(struct iommu_table *tbl
unsigned long flags;
struct scatterlist *s, *outs, *segstart;
int outcount, incount, i;
+   unsigned int align;
unsigned long handle;
 
 	BUG_ON(direction == DMA_NONE);

@@ -309,7 +310,11 @@ int iommu_map_sg(struct iommu_table *tbl
/* Allocate iommu entries for that segment */
vaddr = (unsigned long) sg_virt(s);
npages = iommu_num_pages(vaddr, slen);
-   entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
+   align = 0;
+   if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && (vaddr & ~PAGE_MASK) == 0)
+   align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
+   entry = iommu_range_alloc(tbl, npages, &handle,
+ mask >> IOMMU_PAGE_SHIFT, align);
 
 		DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
 
@@ -572,7 +577,7 @@ dma_addr_t iommu_map_single(struct iommu

 {
dma_addr_t dma_handle = DMA_ERROR_CODE;
unsigned long uaddr;
-   unsigned int npages;
+   unsigned int npages, align;
 
 	BUG_ON(direction == DMA_NONE);
 
@@ -580,8 +585,13 @@ dma_addr_t iommu_map_single(struct iommu

npages = iommu_num_pages(uaddr, size);
 
 	if (tbl) {

+   align = 0;
+   if (IOMMU_PAGE_SHIFT < PAGE_SHIFT &&
+   ((unsigned long)vaddr & ~PAGE_MASK) == 0)
+   align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
+
dma_handle = iommu_alloc(tbl, vaddr, npages, direction,
-mask >> IOMMU_PAGE_SHIFT, 0);
+mask >> IOMMU_PAGE_SHIFT, align);
if (dma_handle == DMA_ERROR_CODE) {
if (printk_ratelimit())  {
printk(KERN_INFO "iommu_alloc failed, "




Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()

2007-12-20 Thread Benjamin Herrenschmidt
pci: Remove pci_enable_device_bars() fix for qla

The previous patch missed one occurrence of pci_enable_device_bars()
in the qla2xxx driver. This fixes it.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

Index: linux-merge/drivers/scsi/qla2xxx/qla_def.h
===
--- linux-merge.orig/drivers/scsi/qla2xxx/qla_def.h 2007-12-21 
15:45:41.0 +1100
+++ linux-merge/drivers/scsi/qla2xxx/qla_def.h  2007-12-21 15:46:12.0 
+1100
@@ -2272,6 +2272,7 @@ typedef struct scsi_qla_host {
spinlock_t  hardware_lock cacheline_aligned;
 
int bars;
+   int mem_only;
device_reg_t __iomem *iobase;   /* Base I/O address */
unsigned long   pio_address;
unsigned long   pio_length;
Index: linux-merge/drivers/scsi/qla2xxx/qla_os.c
===
--- linux-merge.orig/drivers/scsi/qla2xxx/qla_os.c  2007-12-21 
15:46:10.0 +1100
+++ linux-merge/drivers/scsi/qla2xxx/qla_os.c   2007-12-21 15:46:12.0 
+1100
@@ -1626,6 +1626,7 @@ qla2x00_probe_one(struct pci_dev *pdev, 
sprintf(ha->host_str, "%s_%ld", QLA2XXX_DRIVER_NAME, ha->host_no);
ha->parent = NULL;
ha->bars = bars;
+   ha->mem_only = mem_only;
 
/* Set ISP-type information. */
qla2x00_set_isp_flags(ha);
@@ -2905,8 +2906,14 @@ qla2xxx_pci_slot_reset(struct pci_dev *p
 {
pci_ers_result_t ret = PCI_ERS_RESULT_DISCONNECT;
scsi_qla_host_t *ha = pci_get_drvdata(pdev);
+   int rc;
 
-   if (pci_enable_device_bars(pdev, ha->bars)) {
+   if (ha->mem_only)
+   rc = pci_enable_device_mem(pdev);
+   else
+   rc = pci_enable_device(pdev);
+
+   if (rc) {
qla_printk(KERN_WARNING, ha,
"Can't re-enable PCI device after reset.\n");
 




Re: 2.6.22-stable causes oomkiller to be invoked

2007-12-20 Thread Dhaval Giani
> > It was just
> > 
> > while echo ; do cat /sys/kernel/ ; done
> > 
> > it's all in the email threads somewhere..
> 
> The patch that was posted in the thread that I mentioned earlier is here. 
> I ran the test for 15 minutes and things are still fine.
> 
> 
> 
> quicklist: Set tlb->need_flush if pages are remaining in quicklist 0
> 
> This ensures that the quicklists are drained. Otherwise draining may only 
> occur when the processor reaches an idle state.
> 

Hi Christoph,

No, it does not stop the oom I am seeing here.

Thanks,

> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> 
> Index: linux-2.6/include/asm-generic/tlb.h
> ===
> --- linux-2.6.orig/include/asm-generic/tlb.h  2007-12-13 14:45:38.0 
> -0800
> +++ linux-2.6/include/asm-generic/tlb.h   2007-12-13 14:51:07.0 
> -0800
> @@ -14,6 +14,7 @@
>  #define _ASM_GENERIC__TLB_H
> 
>  #include 
> +#include 
>  #include 
>  #include 
> 
> @@ -85,6 +86,9 @@ tlb_flush_mmu(struct mmu_gather *tlb, un
>  static inline void
>  tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
>  {
> +#ifdef CONFIG_QUICKLIST
> + tlb->need_flush += &__get_cpu_var(quicklist)[0].nr_pages != 0;
> +#endif
>   tlb_flush_mmu(tlb, start, end);
> 
>   /* keep the page table cache within bounds */

-- 
regards,
Dhaval


Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()

2007-12-20 Thread Benjamin Herrenschmidt

On Thu, 2007-12-20 at 20:28 -0800, Greg KH wrote:
> On Thu, Dec 20, 2007 at 03:28:10PM +1100, Benjamin Herrenschmidt wrote:
> > Now that all in-tree users are gone, this removes pci_enable_device_bars()
> > completely.
> 
> Hm, looks like you missed drivers/scsi/qla2xxx/qla_os.c
> 
> Quick, before akpm gets mad at you for breaking the build, send me a
> patch!  :)

Argh... there were 2 users in that file and I fixed only one...

Followup patch in a blink.

Cheers,
Ben.




Re: Linux 2.6.24-rc6

2007-12-20 Thread Linus Torvalds


On Thu, 20 Dec 2007, Linus Torvalds wrote:
> 
> It only happened for a few files that had lots of repeated lines - so that 
> the diff could literally be done multiple different ways - and in fact, 
> the file that caused the problems really had a bogus commit that 
> duplicated *way* too much data, and caused lots of #define's to exist 
> twice.

Here's an example of this kind of behaviour: in the 2.6.24-rc5 tree the 
file drivers/video/mbx/reg_bits.h has the #defines for 

/* DINTRS - Display Interrupt Status Register */
/* DINTRE - Display Interrupt Enable Register */

duplicated twice due to commit ba282daa919f89c871780f344a71e5403a70b634 
("mbxfb: Improvements and new features") by Raphael Assenat mistakenly 
adding another copy of the same old set of defines that we already got 
added once before by commit fb137d5b7f2301f2717944322bba38039083c431 
("mbxfb: Add more registers bits access macros").

Now, that was a mistake - and one that probably happened because Raphael or 
more likely Andrew Morton used GNU patch with its insane defaults (which 
is to happily apply the same patch that adds things twice, because it 
doesn't really care if the context matches or not).

But what that kind of thing causes is that when you create a patch of the 
end result, it can show the now new duplicate lines two different (but 
equally valid) ways: it can show it as an addition of the _first_ set of 
lines, or it can show it as an addition of the _second_ set of lines. They 
are the same, after all.

Now, it doesn't really matter which way you choose to show it, although 
because of how "git diff" finds similarities, it tends to prefer to show 
the second set of identical lines as the "new" ones. Which is generally 
reasonable.

However, that interacted really badly with the new git logic that said 
that "if the two files end in the same sequence, just ignore the common 
tail of the file", because the latter copy of the identical lines would 
now show up as _part_ of that common tail, so the lines that the git diff 
machinery would normally like to show up as "new" did in fact end up being 
considered uninteresting, because they were part of an identical tail. 

So now "git diff" would happily pick _earlier_ lines as the new ones, and 
it would still be a conceptually valid diff, but because we had trimmed 
the tail of the file, that conceptually valid diff no longer had the 
expected shared context at the end.
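
The effect is easy to reproduce in miniature. What follows is only a
userspace sketch, not git's actual code: common_tail_len() is a
hypothetical helper that measures the line-aligned common tail of two
buffers. When a block is duplicated in the new file, the second copy
ends up inside the common tail, so only the first copy is left to be
reported as new.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from git): length in bytes of the longest
 * common tail of a[0..an) and b[0..bn) that starts on a line boundary
 * in both buffers.
 */
static size_t common_tail_len(const char *a, size_t an,
                              const char *b, size_t bn)
{
    size_t n = 0;

    assert(a != NULL && b != NULL);

    /* Walk backwards while the trailing bytes match. */
    while (n < an && n < bn && a[an - 1 - n] == b[bn - 1 - n])
        n++;

    /* Shrink until the tail begins right after a newline (or at offset 0). */
    while (n > 0 &&
           !((an == n || a[an - n - 1] == '\n') &&
             (bn == n || b[bn - n - 1] == '\n')))
        n--;

    return n;
}
```

For an old file "A\nX\nB\n" and a new file "A\nX\nX\nB\n" this returns 4:
the tail "X\nB\n" is trimmed from both, leaving "A\n" versus "A\nX\n", so
the earlier X is the one that has to be shown as added.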

And while it's a bit embarrassing, I'm really rather happy that both GNU 
patch and "git apply" actually refused to apply the patch. It may have 
been "conceptually correct" (ie it did really contain all of the changes!) 
but because it lacked the expected context it really wasn't a good patch. 

That was a rather long-winded explanation of what happened, mainly because 
it was all very unexpected to me, and I had personally mistakenly thought 
the git optimization was perfectly valid and actually had to go through 
the end result to see what was going on.

Anyway, the diff on kernel.org should be all ok now, and mirrored out too.

Linus


Re: [PATCH 0/5] sg_ring for scsi

2007-12-20 Thread FUJITA Tomonori
On Fri, 21 Dec 2007 14:26:47 +1100
Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Friday 21 December 2007 13:28:34 FUJITA Tomonori wrote:
> > I'm not sure about chaining the headers (as your sg_ring and
> > scsi_sgtable do) would simplify LLDs. Have you looked at ips or
> > qla1280?
> 
> Not yet, am working my way through the drivers, but I don't expect it will be 
> a simplification to the normal SCSI LLDs.  Most of them are mere consumers of
> sgs...

Some SCSI drivers like ips access the sglist in a tricky way. I feel
that they won't work well with the sg_ring interface. So if you
convert scsi_lib.c to use sg_ring, please check how it works with the
tricky drivers before that.


> I'm not a SCSI person: I'm patching SCSI because I have to to get my
> own sg-using code clean :)

I'm SCSI-biased. If you don't convert scsi to use sg_ring, I won't
complain. :) Though it would be better to have only one mechanism to
handle large sglists in the kernel.


Re: [PATCH 3/3] pci: Remove pci_enable_device_bars()

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 03:28:10PM +1100, Benjamin Herrenschmidt wrote:
> Now that all in-tree users are gone, this removes pci_enable_device_bars()
> completely.

Hm, looks like you missed drivers/scsi/qla2xxx/qla_os.c

Quick, before akpm gets mad at you for breaking the build, send me a
patch!  :)

thanks,

greg k-h


Re: Linux 2.6.24-rc6

2007-12-20 Thread Linus Torvalds


On Thu, 20 Dec 2007, Linus Torvalds wrote:
> 
> The tar-ball and the git archive itself is fine, but yes, the diff from 
> 2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization 
> that has caused way too much pain.

Very interesting breakage. The patch was actually "correct" in a (rather 
limited) technical sense, but the context at the end was missing because 
while the trim_common_tail() code made sure to keep enough common context 
to allow a valid diff to be generated, the diff machinery itself could 
decide that it could generate the diff differently than the "obvious" 
solution.

It only happened for a few files that had lots of repeated lines - so that 
the diff could literally be done multiple different ways - and in fact, 
the file that caused the problems really had a bogus commit that 
duplicated *way* too much data, and caused lots of #define's to exist 
twice.

But the sad fact appears to be that the git optimization (which is very 
important for "git blame", which needs no context) is only really valid 
for that one case where we really don't need any context.
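
A minimal userspace sketch of that rule, under a few assumptions:
"struct buf" is a hypothetical stand-in for git's mmfile_t, and the block
size is passed in by the caller so the example can use small inputs (the
real code uses 1024). Whole matching blocks are dropped from the end
only when no context lines are requested.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for git's mmfile_t: a pointer plus a length. */
struct buf {
    const char *ptr;
    long size;
};

/*
 * Drop the common tail of a and b in units of blk bytes, but only when
 * the caller asked for zero context lines (ctx == 0).
 */
static void trim_common_tail_sketch(struct buf *a, struct buf *b,
                                    long ctx, long blk)
{
    long trimmed = 0;
    long smaller;
    const char *ap, *bp;

    assert(a != NULL && b != NULL && blk > 0);

    if (ctx)
        return; /* context is needed: leave both buffers untouched */

    ap = a->ptr + a->size;
    bp = b->ptr + b->size;
    smaller = (a->size < b->size) ? a->size : b->size;

    /* Compare one block at a time, walking back from the end. */
    while (blk + trimmed <= smaller &&
           !memcmp(ap - blk, bp - blk, (size_t)blk)) {
        trimmed += blk;
        ap -= blk;
        bp -= blk;
    }

    a->size -= trimmed;
    b->size -= trimmed;
}
```

With ctx set to anything non-zero the sizes stay as they were, which is
the whole point: a diff that needs trailing context gets to see the
complete files.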

I uploaded a fixed patch. And here's the git patch to avoid this 
optimization when there is context.

Linus

---
 xdiff-interface.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 9ee877c..0b7e057 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -110,22 +110,22 @@ int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
 {
 	const int blk = 1024;
-	long trimmed = 0, recovered = 0;
+	long trimmed = 0;
 	char *ap = a->ptr + a->size;
 	char *bp = b->ptr + b->size;
 	long smaller = (a->size < b->size) ? a->size : b->size;
 
+	if (ctx)
+		return;
+
 	while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) {
 		trimmed += blk;
 		ap -= blk;
 		bp -= blk;
 	}
 
-	while (recovered < trimmed && 0 <= ctx)
-		if (ap[recovered++] == '\n')
-			ctx--;
-	a->size -= (trimmed - recovered);
-	b->size -= (trimmed - recovered);
+	a->size -= trimmed;
+	b->size -= trimmed;
 }
 
 int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t const *xecfg, xdemitcb_t *xecb)


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Loic Prylli wrote:
> Just curious, do you know of any system where that recommendation was
> not followed? On all motherboards where I have seen an AMD-8131 or an
> AMD-8132, they were alone on their hypertransport link, and other
> "northbridges" (more precisely hypertransport to pci-express or
> pci-whatever, often nvidia) with a "MMCONFIG BAR" were on one of the
> other available hypertransport links in the system.
>
> Loic



Here is the PCI configuration of the HP DL585G2.

You can see two nVidia CK804 PCIE root ports at bus 0 and bus 0x40.
Each of them has an 8132 connected as a subordinate bridge.

[EMAIL PROTECTED] ~]# lspci -vt
-+-[:40]-+-00.0  nVidia Corporation CK804 Memory Controller
 |       +-01.0  nVidia Corporation CK804 Memory Controller
 |       +-0b.0-[:4f-51]--
 |       +-0c.0-[:4c-4e]--
 |       +-0d.0-[:49-4b]--
 |       +-0e.0-[:46-48]--
 |       +-10.0-[:41]--+-01.0  Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet
 |       |             \-02.0  Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet
 |       +-10.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 |       +-11.0-[:42-45]--
 |       \-11.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 \-[:00]-+-00.0  nVidia Corporation CK804 Memory Controller
         +-01.0  nVidia Corporation CK804 ISA Bridge
         +-02.0  nVidia Corporation CK804 USB Controller
         +-02.1  nVidia Corporation CK804 USB Controller
         +-06.0  nVidia Corporation CK804 IDE
         +-09.0-[:01]--+-03.0  ATI Technologies Inc ES1000
         |             +-04.0  Compaq Computer Corporation Integrated Lights Out Controller
         |             +-04.2  Compaq Computer Corporation Integrated Lights Out Processor
         |             +-04.4  Hewlett-Packard Company Proliant iLO2 virtual USB controller
         |             \-04.6  Hewlett-Packard Company Proliant iLO2 virtual UART
         +-0c.0-[:08-0a]00.0  Hewlett-Packard Company Smart Array Controller
         +-0d.0-[:05-07]--
         +-0e.0-[:02-04]--
         +-18.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-18.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-18.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         +-18.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
         +-19.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-19.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-19.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         +-19.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
         +-1a.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-1a.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-1a.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         +-1a.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
         +-1b.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
         +-1b.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
         +-1b.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
         \-1b.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
[EMAIL PROTECTED] ~]#



Re: Linux 2.6.24-rc6

2007-12-20 Thread Linus Torvalds


On Thu, 20 Dec 2007, Kyle McMartin wrote:
> 
> I think I see the problem, it's lack of context in the diff,

No, the problem is that "git diff" is apparently broken by a recent 
optimization. The diff is simply broken.

The tar-ball and the git archive itself is fine, but yes, the diff from 
2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization 
that has caused way too much pain.

Sorry about that, I'll fix it up asap.

Linus


Re: Linux 2.6.24-rc6

2007-12-20 Thread Kyle McMartin
On Thu, Dec 20, 2007 at 07:49:05PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 20 Dec 2007, Kyle McMartin wrote:
> > 
> > I think I see the problem, it's lack of context in the diff,
> 
> No, the problem is that "git diff" is apparently broken by a recent 
> optimization. The diff is simply broken.
> 
> The tar-ball and the git archive itself is fine, but yes, the diff from 
> 2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization 
> that has caused way too much pain.
> 
> Sorry about that, I'll fix it up asap.
> 

no biggie, thanks!

cheers, Kyle


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 9:15 PM, Robert Hancock wrote:
>>
>> Suggested Workaround
>>
>> It is strongly recommended that system designers do not connect the
>> AMD-8132 and devices that use extended
>> configuration space MMIO BARs (ex: HyperTransport-to-PCI Express®
>> bridges) to the same processor
>> HyperTransport link.
>>
>> Fix Planned
>> No
>
> That does sound fairly definitive. I have to wonder why certain system
> designers then didn't follow their strong recommendation..



Just curious, do you know of any system where that recommendation was
not followed? On all motherboards where I have seen an AMD-8131 or an
AMD-8132, they were alone on their hypertransport link, and other
"northbridges" (more precisely hypertransport to pci-express or
pci-whatever, often nvidia) with a "MMCONFIG BAR" were on one of the
other available hypertransport links in the system.


Loic



Re: [patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread H. Peter Anvin

Mathieu Desnoyers wrote:
> Argh.. Rusty asked to have a simplified version first, and then to
> implement the "more complex" one on top of it. However, in order to get
> the reentrancy I need for the markers, I need the complex version of the
> immediate values. Therefore, you find, in this patchset, the simple
> version first, and then, the more complex one implemented on top.
>
> About this patch header, the initial idea was to use the "Q" and "R"
> constraints, but, as stated just below, the "q" and "r" constraints are
> used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes
> immediate values are never used. So the complete header follows the
> source code, it's just that this paragraph could be clearer.



Then you have it backwards.  "Q" and "R" avoid REX prefixes, "q" and "r" 
DO NOT.
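
To make the distinction concrete, here is a minimal sketch assuming
GCC-style inline asm on x86. The function names are made up for
illustration, and the non-x86 fallback is there only so the file builds
elsewhere. Per the GCC machine-constraint documentation, "Q" means a, b,
c or d and "R" means one of the eight legacy integer registers; "q" and
"r" widen to the full x86-64 register set, where r8-r15 need a REX
prefix.

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* 1-byte immediate load; "Q" restricts the destination to al/bl/cl/dl. */
static unsigned char imm8_no_rex(void)
{
    unsigned char v;

    __asm__("movb $42, %0" : "=Q"(v));
    return v;
}

/* 4-byte immediate load; "R" restricts the destination to a legacy register. */
static unsigned int imm32_no_rex(void)
{
    unsigned int v;

    __asm__("movl $1234, %0" : "=R"(v));
    return v;
}

#else

/* Fallback for other architectures: same results, no inline asm. */
static unsigned char imm8_no_rex(void) { return 42; }
static unsigned int imm32_no_rex(void) { return 1234; }

#endif
```

Swapping "=Q" for "=q" or "=R" for "=r" would still assemble, but on
x86-64 the compiler would then be free to pick a register whose encoding
carries a REX prefix.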


-hpa


Re: [PATCH 0/5] sg_ring for scsi

2007-12-20 Thread Rusty Russell
On Friday 21 December 2007 13:28:34 FUJITA Tomonori wrote:
> I'm not sure about chaining the headers (as your sg_ring and
> scsi_sgtable do) would simplify LLDs. Have you looked at ips or
> qla1280?

Not yet, am working my way through the drivers, but I don't expect it will be 
a simplification to the normal SCSI LLDs.  Most of them are mere consumers of
sgs...

I'm not a SCSI person: I'm patching SCSI because I have to to get my own 
sg-using code clean :)

Hope that clarifies,
Rusty.


[PATCH] mbx: Fix up duplicate defines in reg_bits.h

2007-12-20 Thread Kyle McMartin
Otherwise patch gets horribly confused and falls over applying
the diff. Not sure why these were being defined twice.

Signed-off-by: Kyle McMartin <[EMAIL PROTECTED]>
---
Well, we can get it fixed for -git1, I respun the patch-2.6.24-rc6 diff
with git diff -p v2.6.23..HEAD and applied it to a pristine linux-2.6.23
tree without issue.

cheers,
Kyle

 drivers/video/mbx/reg_bits.h |   24 
 1 files changed, 0 insertions(+), 24 deletions(-)

diff --git a/drivers/video/mbx/reg_bits.h b/drivers/video/mbx/reg_bits.h
index 5f14b4b..8dc4283 100644
--- a/drivers/video/mbx/reg_bits.h
+++ b/drivers/video/mbx/reg_bits.h
@@ -540,30 +540,6 @@
 #define DINTRE_HBLNK1_EN	(1 << 1)
 #define DINTRE_HBLNK0_EN	(1 << 0)
 
-/* DINTRS - Display Interrupt Status Register */
-#define DINTRS_CUR_OR_S	(1 << 18)
-#define DINTRS_STR2_OR_S	(1 << 17)
-#define DINTRS_STR1_OR_S	(1 << 16)
-#define DINTRS_CUR_UR_S	(1 << 6)
-#define DINTRS_STR2_UR_S	(1 << 5)
-#define DINTRS_STR1_UR_S	(1 << 4)
-#define DINTRS_VEVENT1_S	(1 << 3)
-#define DINTRS_VEVENT0_S	(1 << 2)
-#define DINTRS_HBLNK1_S	(1 << 1)
-#define DINTRS_HBLNK0_S	(1 << 0)
-
-/* DINTRE - Display Interrupt Enable Register */
-#define DINTRE_CUR_OR_EN	(1 << 18)
-#define DINTRE_STR2_OR_EN	(1 << 17)
-#define DINTRE_STR1_OR_EN	(1 << 16)
-#define DINTRE_CUR_UR_EN	(1 << 6)
-#define DINTRE_STR2_UR_EN	(1 << 5)
-#define DINTRE_STR1_UR_EN	(1 << 4)
-#define DINTRE_VEVENT1_EN	(1 << 3)
-#define DINTRE_VEVENT0_EN	(1 << 2)
-#define DINTRE_HBLNK1_EN	(1 << 1)
-#define DINTRE_HBLNK0_EN	(1 << 0)
-
 
 /* DLSTS - display load status register */
 #define DLSTS_RLD_ADONE	(1 << 23)
-- 
1.5.3.6



Re: [patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote:
> This patch is modified by another patch in the sequence.  This feels 
> needlessly confusing when reviewing (especially since the comment doesn't 
> look to match the code, e.g. w.r.t to "Q" and "R" constraints); can you 
> reorder the patchset to avoid that?
>

Argh.. Rusty asked to have a simplified version first, and then to
implement the "more complex" one on top of it. However, in order to get
the reentrancy I need for the markers, I need the complex version of the
immediate values. Therefore, you find, in this patchset, the simple
version first, and then, the more complex one implemented on top.

About this patch header, the initial idea was to use the "Q" and "R"
constraints, but, as stated just below, the "q" and "r" constraints are
used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes
immediate values are never used. So the complete header follows the
source code, it's just that this paragraph could be clearer.

Mathieu

>   -hpa

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread David Miller
From: Matt Mackall <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 19:06:55 -0600

> @@ -707,7 +707,10 @@ static ssize_t kpagecount_read(struct fi
>   return -EIO;
>  
>   while (count > 0) {
> - ppage = pfn_to_page(pfn++);
> + ppage = 0;
> + if (pfn_valid(pfn))
> + ppage = pfn_to_page(pfn);
> + pfn++;
>   if (!ppage)
>   pcount = 0;
>   else

Yes that should work, please use "NULL" in the final
version of the patch instead of "0" so that sparse is
happy.
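
For reference, the shape of the guarded lookup with that change applied
might look like the following userspace sketch. The "struct page" layout
and the *_sketch helpers are stand-ins invented for illustration, not
the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct page and pfn helpers. */
struct page {
    int count;
};

#define NR_PAGES_SKETCH 4UL
static struct page pages_sketch[NR_PAGES_SKETCH] = { {1}, {2}, {0}, {5} };

/* Pretend pfn 2 is a hole in the memory map. */
static int pfn_valid_sketch(unsigned long pfn)
{
    return pfn < NR_PAGES_SKETCH && pfn != 2;
}

static struct page *pfn_to_page_sketch(unsigned long pfn)
{
    assert(pfn < NR_PAGES_SKETCH);
    return &pages_sketch[pfn];
}

/* Return the page count for pfn, or 0 when pfn has no struct page. */
static int page_count_or_zero(unsigned long pfn)
{
    struct page *ppage = NULL; /* NULL rather than 0 for a pointer */

    if (pfn_valid_sketch(pfn))
        ppage = pfn_to_page_sketch(pfn);

    return ppage ? ppage->count : 0;
}
```

The lookup is only attempted for valid pfns, so a hole in the map reads
back as a count of zero instead of dereferencing a bogus struct page.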

Thanks.


Re: Trailing periods in kernel messages

2007-12-20 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 18:15:32 -0800

> No-period is a kernel idiom, produces perfectly readable output, I have
> never ever heard of anyone expressing the least concern over a lack of dots
> at the end of their printks and 91% of kernel code agrees.

I have never heard of a compiler expressing the least concern over
whitespace and other aspects of coding style.


Re: [PATCH 0/10] sysfs network namespace support

2007-12-20 Thread Greg KH
On Sat, Dec 01, 2007 at 02:06:58AM -0700, Eric W. Biederman wrote:
> 
> Now that we have network namespace support merged it is time to
> revisit the sysfs support so we can remove the dependency on !SYSFS.



Oops, I forgot to apply this to my tree.  Eric, you still want this
submitted, right?

thanks,

greg k-h


Re: [patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread H. Peter Anvin
This patch is modified by another patch in the sequence.  This feels 
needlessly confusing when reviewing (especially since the comment 
doesn't look to match the code, e.g. w.r.t. the "Q" and "R" constraints); 
can you reorder the patchset to avoid that?


-hpa


patch pci-remove-users-of-pci_enable_device_bars.patch added to gregkh-2.6 tree

2007-12-20 Thread gregkh

This is a note to let you know that I've just added the patch titled

 Subject: PCI: Remove users of pci_enable_device_bars()

to my gregkh-2.6 tree.  Its filename is

 pci-remove-users-of-pci_enable_device_bars.patch

This tree can be found at 
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From [EMAIL PROTECTED] Wed Dec 19 20:30:44 2007
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 15:28:09 +1100
Subject: PCI: Remove users of pci_enable_device_bars()
To: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], , <[EMAIL PROTECTED]>, 
Alan Cox <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, 
Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>


This patch converts users of pci_enable_device_bars() to the new
pci_enable_device_{io,mem} interface.

The new API fits nicely, except maybe for the QLA case where a bit of
code re-organization might be a good idea but I prefer sticking to the
simple patch as I don't have hardware to test on.

I'll also need some feedback on the cs5520 change.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/ata/pata_cs5520.c   |2 +-
 drivers/i2c/busses/scx200_acb.c |2 +-
 drivers/ide/pci/cs5520.c|   10 --
 drivers/ide/setup-pci.c |6 --
 drivers/scsi/lpfc/lpfc_init.c   |3 +--
 drivers/scsi/qla2xxx/qla_os.c   |   12 +---
 6 files changed, 24 insertions(+), 11 deletions(-)

--- a/drivers/ata/pata_cs5520.c
+++ b/drivers/ata/pata_cs5520.c
@@ -229,7 +229,7 @@ static int __devinit cs5520_init_one(str
return -ENOMEM;
 
/* Perform set up for DMA */
-   if (pci_enable_device_bars(pdev, 1<<2)) {
+   if (pci_enable_device_io(pdev)) {
printk(KERN_ERR DRV_NAME ": unable to configure BAR2.\n");
return -ENODEV;
}
--- a/drivers/i2c/busses/scx200_acb.c
+++ b/drivers/i2c/busses/scx200_acb.c
@@ -492,7 +492,7 @@ static __init int scx200_create_pci(cons
iface->pdev = pdev;
iface->bar = bar;
 
-   rc = pci_enable_device_bars(iface->pdev, 1 << iface->bar);
+   rc = pci_enable_device_io(iface->pdev);
if (rc)
goto errout_free;
 
--- a/drivers/ide/pci/cs5520.c
+++ b/drivers/ide/pci/cs5520.c
@@ -160,8 +160,14 @@ static int __devinit cs5520_init_one(str
ide_setup_pci_noise(dev, d);
 
/* We must not grab the entire device, it has 'ISA' space in its
-  BARS too and we will freak out other bits of the kernel */
-   if (pci_enable_device_bars(dev, 1<<2)) {
+* BARS too and we will freak out other bits of the kernel
+*
+* pci_enable_device_bars() is going away. I replaced it with
+* IO only enable for now but I'll need confirmation this is
+* allright for that device. If not, it will need some kind of
+* quirk. --BenH.
+*/
+   if (pci_enable_device_io(dev)) {
printk(KERN_WARNING "%s: Unable to enable 55x0.\n", d->name);
return -ENODEV;
}
--- a/drivers/ide/setup-pci.c
+++ b/drivers/ide/setup-pci.c
@@ -236,7 +236,9 @@ EXPORT_SYMBOL_GPL(ide_setup_pci_noise);
  * @d: IDE port info
  *
  * Enable the IDE PCI device. We attempt to enable the device in full
- * but if that fails then we only need BAR4 so we will enable that.
+ * but if that fails then we only need IO space. The PCI code should
+ * have setup the proper resources for us already for controllers in
+ * legacy mode.
  * 
  * Returns zero on success or an error code
  */
@@ -246,7 +248,7 @@ static int ide_pci_enable(struct pci_dev
int ret;
 
if (pci_enable_device(dev)) {
-   ret = pci_enable_device_bars(dev, 1 << 4);
+   ret = pci_enable_device_io(dev);
if (ret < 0) {
printk(KERN_WARNING "%s: (ide_setup_pci_device:) "
"Could not enable device.\n", d->name);
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -2100,10 +2100,9 @@ static pci_ers_result_t lpfc_io_slot_res
 	struct Scsi_Host *shost = pci_get_drvdata(pdev);
 	struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba;
 	struct lpfc_sli *psli = &phba->sli;
-	int bars = pci_select_bars(pdev, IORESOURCE_MEM);
 
 	dev_printk(KERN_INFO, &pdev->dev, "recovering from a slot reset.\n");
-	if (pci_enable_device_bars(pdev, bars)) {
+	if (pci_enable_device_mem(pdev)) {
 		printk(KERN_ERR "lpfc: Cannot re-enable "
 			"PCI device after reset.\n");
 		return PCI_ERS_RESULT_DISCONNECT;
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1583,7 +1583,7 @@ qla2x00_probe_one(struct pci_dev *pdev, 
char pci_info[30];
char fw_str[30];

Re: Linux 2.6.24-rc6

2007-12-20 Thread Kyle McMartin
On Thu, Dec 20, 2007 at 09:48:05PM -0500, Kyle McMartin wrote:
> 1 out of 3 hunks FAILED -- saving rejects to file
> drivers/video/mbx/reg_bits.h.rej
> error: Bad exit status from /var/tmp/rpm-tmp.22316 (%prep)
> 

I think I see the problem, it's lack of context in the diff,

commit ba282daa919f89c871780f344a71e5403a70b634
Author: Raphael Assenat <[EMAIL PROTECTED]>
Date:   Tue Oct 16 01:28:40 2007 -0700

seems to duplicate the DINTRS & DINTRE defines for no obvious reason,
confusing the hell out of patch.

regards,
Kyle


patch pci-remove-pci_enable_device_bars.patch added to gregkh-2.6 tree

2007-12-20 Thread gregkh

This is a note to let you know that I've just added the patch titled

 Subject: PCI: Remove pci_enable_device_bars()

to my gregkh-2.6 tree.  Its filename is

 pci-remove-pci_enable_device_bars.patch

This tree can be found at 
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From [EMAIL PROTECTED] Wed Dec 19 20:30:57 2007
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 15:28:10 +1100
Subject: PCI: Remove pci_enable_device_bars()
To: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], , <[EMAIL PROTECTED]>, 
Alan Cox <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, 
Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>


Now that all in-tree users are gone, this removes pci_enable_device_bars()
completely.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/pci.c   |   24 
 include/linux/pci.h |1 -
 2 files changed, 25 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -741,29 +741,6 @@ int pci_reenable_device(struct pci_dev *
 	return 0;
 }
 
-/**
- * pci_enable_device_bars - Initialize some of a device for use
- * @dev: PCI device to be initialized
- * @bars: bitmask of BAR's that must be configured
- *
- *  Initialize device before it's used by a driver. Ask low-level code
- *  to enable selected I/O and memory resources. Wake up the device if it
- *  was suspended. Beware, this function can fail.
- */
-int
-pci_enable_device_bars(struct pci_dev *dev, int bars)
-{
-	int err;
-
-	if (atomic_add_return(1, &dev->enable_cnt) > 1)
-		return 0;		/* already enabled */
-
-	err = do_pci_enable_device(dev, bars);
-	if (err < 0)
-		atomic_dec(&dev->enable_cnt);
-	return err;
-}
-
 static int __pci_enable_device_flags(struct pci_dev *dev,
 				     resource_size_t flags)
 {
@@ -1695,7 +1672,6 @@ early_param("pci", pci_setup);
 device_initcall(pci_init);
 
 EXPORT_SYMBOL(pci_reenable_device);
-EXPORT_SYMBOL(pci_enable_device_bars);
 EXPORT_SYMBOL(pci_enable_device_io);
 EXPORT_SYMBOL(pci_enable_device_mem);
 EXPORT_SYMBOL(pci_enable_device);
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -543,7 +543,6 @@ static inline int pci_write_config_dword
 }
 
 int __must_check pci_enable_device(struct pci_dev *dev);
-int __must_check pci_enable_device_bars(struct pci_dev *dev, int mask);
 int __must_check pci_enable_device_io(struct pci_dev *dev);
 int __must_check pci_enable_device_mem(struct pci_dev *dev);
 int __must_check pci_reenable_device(struct pci_dev *);


Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

bad/battery-class-driver.patch
driver/adb-convert-from-class_device-to-device.patch
driver/kobject-convert-hvc_console-to-use-kref-not-kobject.patch
driver/kobject-convert-hvcs-to-use-kref-not-kobject.patch
driver/kobject-convert-icom-to-use-kref-not-kobject.patch
pci/pci-fix-bus-resource-assignment-on-32-bits-with-64b-resources.patch
pci/pci-fix-warning-in-setup-res.c-on-32-bit-platforms-with-64-bit-resources.patch
pci/pci-add-pci_enable_device_-io-mem-intefaces.patch
pci/pci-remove-pci_enable_device_bars.patch
pci/pci-remove-users-of-pci_enable_device_bars.patch
usb/usb-remove-ohci-useless-masking-unmasking-of-wdh-interrupt.patch


patch pci-add-pci_enable_device_-io-mem-intefaces.patch added to gregkh-2.6 tree

2007-12-20 Thread gregkh

This is a note to let you know that I've just added the patch titled

 Subject: PCI: Add pci_enable_device_{io,mem} intefaces

to my gregkh-2.6 tree.  Its filename is

 pci-add-pci_enable_device_-io-mem-intefaces.patch

This tree can be found at 
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From [EMAIL PROTECTED] Wed Dec 19 20:30:44 2007
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 15:28:08 +1100
Subject: PCI: Add pci_enable_device_{io,mem} intefaces
To: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], , <[EMAIL PROTECTED]>, 
Alan Cox <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, 
Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>


The pci_enable_device_bars() interface isn't well suited to PCI
because you can't actually enable/disable BARs individually on
a device. So for example, if a device has 2 memory BARs 0 and 1,
and one of them (let's say 1) has not been successfully allocated
by the firmware or the kernel, then enabling memory decoding
shouldn't be permitted for the entire device since it will decode
whatever random address is still in that BAR 1.

So a device must be either fully enabled for IO, for Memory, or
for both. Not on a per-BAR basis.

This provides two new functions, pci_enable_device_io() and
pci_enable_device_mem() to replace pci_enable_device_bars(). The
implementation internally builds a BAR mask in order to be able
to use existing arch infrastructure.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Acked-by: Ivan Kokshaysky <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/pci.c   |   49 -
 include/linux/pci.h |2 ++
 2 files changed, 50 insertions(+), 1 deletion(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -764,6 +764,51 @@ pci_enable_device_bars(struct pci_dev *d
return err;
 }
 
+static int __pci_enable_device_flags(struct pci_dev *dev,
+resource_size_t flags)
+{
+   int err;
+   int i, bars = 0;
+
+   if (atomic_add_return(1, &dev->enable_cnt) > 1)
+           return 0;   /* already enabled */
+
+   for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
+           if (dev->resource[i].flags & flags)
+                   bars |= (1 << i);
+
+   err = do_pci_enable_device(dev, bars);
+   if (err < 0)
+           atomic_dec(&dev->enable_cnt);
+   return err;
+}
+
+/**
+ * pci_enable_device_io - Initialize a device for use with IO space
+ * @dev: PCI device to be initialized
+ *
+ *  Initialize device before it's used by a driver. Ask low-level code
+ *  to enable I/O resources. Wake up the device if it was suspended.
+ *  Beware, this function can fail.
+ */
+int pci_enable_device_io(struct pci_dev *dev)
+{
+   return __pci_enable_device_flags(dev, IORESOURCE_IO);
+}
+
+/**
+ * pci_enable_device_mem - Initialize a device for use with Memory space
+ * @dev: PCI device to be initialized
+ *
+ *  Initialize device before it's used by a driver. Ask low-level code
+ *  to enable Memory resources. Wake up the device if it was suspended.
+ *  Beware, this function can fail.
+ */
+int pci_enable_device_mem(struct pci_dev *dev)
+{
+   return __pci_enable_device_flags(dev, IORESOURCE_MEM);
+}
+
 /**
  * pci_enable_device - Initialize device before it's used by a driver.
  * @dev: PCI device to be initialized
@@ -777,7 +822,7 @@ pci_enable_device_bars(struct pci_dev *d
  */
 int pci_enable_device(struct pci_dev *dev)
 {
-   return pci_enable_device_bars(dev, (1 << PCI_NUM_RESOURCES) - 1);
+   return __pci_enable_device_flags(dev, IORESOURCE_MEM | IORESOURCE_IO);
 }
 
 /*
@@ -1651,6 +1696,8 @@ device_initcall(pci_init);
 
 EXPORT_SYMBOL(pci_reenable_device);
 EXPORT_SYMBOL(pci_enable_device_bars);
+EXPORT_SYMBOL(pci_enable_device_io);
+EXPORT_SYMBOL(pci_enable_device_mem);
 EXPORT_SYMBOL(pci_enable_device);
 EXPORT_SYMBOL(pcim_enable_device);
 EXPORT_SYMBOL(pcim_pin_device);
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -544,6 +544,8 @@ static inline int pci_write_config_dword
 
 int __must_check pci_enable_device(struct pci_dev *dev);
 int __must_check pci_enable_device_bars(struct pci_dev *dev, int mask);
+int __must_check pci_enable_device_io(struct pci_dev *dev);
+int __must_check pci_enable_device_mem(struct pci_dev *dev);
 int __must_check pci_reenable_device(struct pci_dev *);
 int __must_check pcim_enable_device(struct pci_dev *pdev);
 void pcim_pin_device(struct pci_dev *pdev);
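
The mask-building loop in __pci_enable_device_flags() can be tried out
outside the kernel. What follows is only a minimal userspace sketch under
stated assumptions, not kernel code: struct demo_resource, demo_bar_mask(),
demo_dev and the DEMO_* constants are hypothetical stand-ins, and the two
flag values are assumed to mirror IORESOURCE_IO and IORESOURCE_MEM.

```c
/* Hypothetical stand-ins for the kernel definitions (values assumed). */
#define DEMO_IORESOURCE_IO   0x00000100UL
#define DEMO_IORESOURCE_MEM  0x00000200UL
#define DEMO_RESOURCE_COUNT  6

struct demo_resource {
    unsigned long flags;    /* resource type bits, 0 when unused */
};

/*
 * Build a bitmask with bit i set when resource i carries any of the
 * requested type flags. This mirrors the loop in the patch: the caller
 * asks for a resource type, never for individual BARs.
 */
static int demo_bar_mask(const struct demo_resource *res, int count,
                         unsigned long flags)
{
    int i, bars = 0;

    for (i = 0; i < count; i++)
        if (res[i].flags & flags)
            bars |= (1 << i);
    return bars;
}

/* Example device: BARs 0 and 1 are memory, BAR 2 is I/O, the rest unused. */
static const struct demo_resource demo_dev[DEMO_RESOURCE_COUNT] = {
    { DEMO_IORESOURCE_MEM },
    { DEMO_IORESOURCE_MEM },
    { DEMO_IORESOURCE_IO },
};
```

With this example device, asking for memory selects both memory BARs at
once (mask 0x3), asking for I/O selects only BAR 2 (mask 0x4), and asking
for both gives 0x7, which is the all-or-nothing behaviour per resource type
that the changelog argues for.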


Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

bad/battery-class-driver.patch
driver/adb-convert-from-class_device-to-device.patch
driver/kobject-convert-hvc_console-to-use-kref-not-kobject.patch
driver/kobject-convert-hvcs-to-use-kref-not-kobject.patch
driver/kobject-convert-icom-to-use-kref-not-kobject.patch

patch pci-correctly-initialize-a-structure-for-pcie_save_pcix_state.patch added to gregkh-2.6 tree

2007-12-20 Thread gregkh

This is a note to let you know that I've just added the patch titled

 Subject: PCI: correctly initialize a structure for pcie_save_pcix_state()

to my gregkh-2.6 tree.  Its filename is

 pci-correctly-initialize-a-structure-for-pcie_save_pcix_state.patch

This tree can be found at 
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From [EMAIL PROTECTED] Mon Dec 17 18:02:37 2007
From: Shaohua Li <[EMAIL PROTECTED]>
Date: Tue, 18 Dec 2007 09:56:56 +0800
Subject: PCI: correctly initialize a structure for pcie_save_pcix_state()
To: lkml 
Cc: Andrew Morton <[EMAIL PROTECTED]>, Greg KH <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>


save_state->cap_nr should be correctly set, otherwise we can't find the
saved cap at resume.

Signed-off-by: Shaohua Li <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/pci.c |2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -587,6 +587,7 @@ static int pci_save_pcie_state(struct pc
    pci_read_config_word(dev, pos + PCI_EXP_LNKCTL, &cap[i++]);
    pci_read_config_word(dev, pos + PCI_EXP_SLTCTL, &cap[i++]);
    pci_read_config_word(dev, pos + PCI_EXP_RTCTL, &cap[i++]);
+   save_state->cap_nr = PCI_CAP_ID_EXP;
    pci_add_saved_cap(dev, save_state);
    return 0;
 }
@@ -630,6 +631,7 @@ static int pci_save_pcix_state(struct pc
    cap = (u16 *)&save_state->data[0];
 
    pci_read_config_word(dev, pos + PCI_X_CMD, &cap[i++]);
+   save_state->cap_nr = PCI_CAP_ID_PCIX;
    pci_add_saved_cap(dev, save_state);
    return 0;
 }
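
Why the missing cap_nr matters can be shown with a small model of a
saved-capability list that is searched by capability ID at resume time.
This is a hedged userspace sketch, not the kernel's implementation: the
demo_* names and the singly linked list are assumptions made for
illustration, and the two ID values are assumed to match PCI_CAP_ID_PCIX
and PCI_CAP_ID_EXP.

```c
#include <stddef.h>

/* Assumed to match PCI_CAP_ID_PCIX and PCI_CAP_ID_EXP. */
#define DEMO_CAP_ID_PCIX 0x07
#define DEMO_CAP_ID_EXP  0x10

/* Hypothetical saved-state record, keyed by capability ID. */
struct demo_cap_saved_state {
    struct demo_cap_saved_state *next;
    int cap_nr;
    unsigned int data[4];
};

/* Push a saved state onto the front of the per-device list. */
static void demo_add_saved_cap(struct demo_cap_saved_state **head,
                               struct demo_cap_saved_state *state)
{
    state->next = *head;
    *head = state;
}

/* Find the saved state for a capability, or NULL when none matches. */
static struct demo_cap_saved_state *
demo_find_saved_cap(struct demo_cap_saved_state *head, int cap)
{
    for (; head; head = head->next)
        if (head->cap_nr == cap)
            return head;
    return NULL;
}
```

If a record is added while its cap_nr is still zero, a later lookup for
DEMO_CAP_ID_EXP returns NULL even though the data was saved; once cap_nr is
set before the add, the same lookup finds it.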


Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

driver/kobject-change-drivers-cpuidle-sysfs.c-to-use-kobject_init_and_add.patch
pci/pcie-port-driver-correctly-detect-native-pme-feature.patch
pci/pcie-utilize-pcie-transaction-pending-bit.patch
pci/pci-add-pci-quirk-function-for-some-chipsets.patch
pci/pci-avoid-save-the-same-type-of-cap-multiple-times.patch
pci/pci-correctly-initialize-a-structure-for-pcie_save_pcix_state.patch
pci/pci-fix-typo-in-pci_save_pcix_state.patch


Re: [PATCH 2/5] dma_map_sg_ring() helper

2007-12-20 Thread Rusty Russell
On Friday 21 December 2007 11:40:00 David Miller wrote:
> From: Rusty Russell <[EMAIL PROTECTED]>
> Date: Fri, 21 Dec 2007 11:35:12 +1100
>
> > On Friday 21 December 2007 11:00:27 FUJITA Tomonori wrote:
> > > We need to pass the whole sg entries to the IOMMUs at a time.
> >
> > Hi Fujita,
> >
> > OK, it's certainly possible to have an arch override.  For which
> > architecture is this BTW?
>
> SPARC64, POWERPC, maybe IA-64 etc.
>
> Basically any platform that potentially does virtual
> remapping and thus linearization.

Fujita said "need" which confused me.  I already said it should be handed
down as an optimization; I was curious what I had broken :)

> I think it should always be provided, the new APIs give
> less information to the implementation and that's a step
> backwards.

Absolutely.  In fact, I think the sg_ring header would be made safer if it
had the "dma_num" in it as well: it's more explicit and less surprising to
the caller than mangling sg->num.

How are these two patches then?

===
Introduce sg_ring: a ring of scatterlist arrays.

This patch introduces 'struct sg_ring', a layer on top of scatterlist
arrays.  It meshes nicely with routines which expect a simple array of
'struct scatterlist' because it is easy to break down the ring into
its constituent arrays.

The sg_ring header also encodes the maximum number of entries, useful
for routines which populate an sg.  We need never hand around a number
of elements any more.
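
To make the "ring of arrays" idea concrete, here is a simplified userspace
model of walking such a ring. It is a sketch under assumptions rather than
the header from this patch: struct demo_ring uses a plain next pointer
instead of a list_head, holds ints instead of struct scatterlist, and all
demo_* names are hypothetical.

```c
#include <stddef.h>

/* One array in the ring; the last element's next points back to the head. */
struct demo_ring {
    struct demo_ring *next;
    unsigned int num;       /* number of valid entries */
    unsigned int max;       /* capacity of the entries array */
    const int *entries;
};

/* Step to the next array, returning NULL once we are back at the head. */
static const struct demo_ring *demo_ring_next(const struct demo_ring *r,
                                              const struct demo_ring *head)
{
    r = r->next;
    return r == head ? NULL : r;
}

/* Count valid entries across the whole ring; empty arrays add nothing. */
static unsigned int demo_ring_count(const struct demo_ring *head)
{
    const struct demo_ring *r;
    unsigned int total = 0;

    for (r = head; r; r = demo_ring_next(r, head))
        total += r->num;
    return total;
}

/* Visit every valid entry of every array in ring order. */
static long demo_ring_sum(const struct demo_ring *head)
{
    const struct demo_ring *r;
    unsigned int i;
    long sum = 0;

    for (r = head; r; r = demo_ring_next(r, head))
        for (i = 0; i < r->num; i++)
            sum += r->entries[i];
    return sum;
}
```

Because each header carries its own num and max, callers pass only the head
pointer, and arrays with num == 0 are skipped without special handling.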

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
---
 include/linux/sg_ring.h |   74 
 1 files changed, 74 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/sg_ring.h

diff --git a/include/linux/sg_ring.h b/include/linux/sg_ring.h
new file mode 100644
--- /dev/null
+++ b/include/linux/sg_ring.h
@@ -0,0 +1,128 @@
+#ifndef _LINUX_SG_RING_H
+#define _LINUX_SG_RING_H
+#include 
+
+/**
+ * struct sg_ring - a ring of scatterlists
+ * @list: the list_head chaining them together
+ * @num: the number of valid sg entries
+ * @dma_num: the number of valid sg entries after dma mapping
+ * @max: the maximum number of sg entries (size of the sg array).
+ * @sg: the array of scatterlist entries.
+ *
+ * This provides a convenient encapsulation of one or more scatter gather
+ * arrays.  dma_map_sg_ring() (and friends) set @dma_num: some architectures
+ * coalesce sg entries, so this will be < num.
+ */
+struct sg_ring
+{
+   struct list_head list;
+   unsigned int num, dma_num, max;
+   struct scatterlist sg[0];
+};
+
+/* This helper declares an sg ring on the stack or in a struct. */
+#define DECLARE_SG_RING(name, max) \
+   struct {\
+   struct sg_ring ring;\
+   struct scatterlist sg[max]; \
+   } name
+
+/**
+ * sg_ring_init - initialize a scatterlist ring.
+ * @sg: the sg_ring.
+ * @max: the size of the trailing sg array.
+ *
+ * After initialization sg is alone in the ring.
+ */
+static inline void sg_ring_init(struct sg_ring *sg, unsigned int max)
+{
+#ifdef CONFIG_DEBUG_SG
+   unsigned int i;
+   for (i = 0; i < max; i++)
+   sg->sg[i].sg_magic = SG_MAGIC;
+   sg->num = 0x;
+   sg->dma_num = 0x;
+#endif
+   INIT_LIST_HEAD(&sg->list);
+   sg->max = max;
+   /* FIXME: This is to clear the page bits. */
+   sg_init_table(sg->sg, sg->max);
+}
+
+/**
+ * sg_ring_single - initialize a one-element scatterlist ring.
+ * @sg: the sg_ring.
+ * @buf: the pointer to the buffer.
+ * @buflen: the length of the buffer.
+ *
+ * Does sg_ring_init and also sets up first (and only) sg element.
+ */
+static inline void sg_ring_single(struct sg_ring *sg,
+ const void *buf,
+ unsigned int buflen)
+{
+   sg_ring_init(sg, 1);
+   sg->num = 1;
+   sg_init_one(&sg->sg[0], buf, buflen);
+}
+
+/**
+ * sg_ring_next - next array in a scatterlist ring.
+ * @sg: the sg_ring.
+ * @head: the sg_ring head.
+ *
+ * This will return NULL once @sg has looped back around to @head.
+ */
+static inline struct sg_ring *sg_ring_next(const struct sg_ring *sg,
+  const struct sg_ring *head)
+{
+   sg = list_first_entry(&sg->list, struct sg_ring, list);
+   if (sg == head)
+   sg = NULL;
+   return (struct sg_ring *)sg;
+}
+
+/* Helper for writing for loops. */
+static inline struct sg_ring *sg_ring_iter(const struct sg_ring *head,
+  const struct sg_ring *sg,
+  unsigned int *i)
+{
+   (*i)++;
+   /* While loop lets us skip any zero-entry sg_ring arrays */
+   while (*i == sg->num) {
+   *i = 0;
+   sg = sg_ring_next(sg, head);
+   if (!sg)
+   break;
+   }
+   return (struct sg_ring *)sg;
+}
+
+/**
+ * sg_ring_for_each 

Re: Linux 2.6.24-rc6

2007-12-20 Thread Kyle McMartin
On Thu, Dec 20, 2007 at 05:41:09PM -0800, Linus Torvalds wrote:
> The regression list keeps shrinking, so we're still on track for a full 
> 2.6.24 release in early January. Assuming we don't all overeat during the 
> holidays and nobody gets any work done. But we all know that the holidays 
> are really the time when we get away from the boring "real work", and can 
> spend 24/7 on kernel hacking instead, right?
> 

The patch-2.6.24-rc6.bz2 doesn't seem to apply to a pristine
linux-2.6.23 tree? I see this while updating Fedora:

+ '[' '!' -f /home/kyle/rpms/kernel/devel/patch-2.6.24-rc6.bz2 ']'
+ case "$patch" in
+ bunzip2
+ patch -p1 -F1 -s
1 out of 3 hunks FAILED -- saving rejects to file
drivers/video/mbx/reg_bits.h.rej
error: Bad exit status from /var/tmp/rpm-tmp.22316 (%prep)

cheers, Kyle


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:
> I have to wonder why certain system
> designers then didn't follow their strong recommendation..

I don't think I want to go there.

I used to be a hardware/firmware guy.
:D :D


Re: Linux 2.6.24-rc6

2007-12-20 Thread Zhang, Yanmin
On Thu, 2007-12-20 at 17:41 -0800, Linus Torvalds wrote:
> The most noticeable part here (both to users and in the diffstat) should 
> be the libata-acpi fixes by Tejun Heo, which should hopefully take care of 
> all of the regressions that were caused by teaching SATA about doing the 
> proper ACPI stuff at bootup/suspend/resume/shutdown.
> 
> Other changes visible in the diffstat are a couple of new watchdog drivers 
> and the removal of the old tipar driver, and some Korean translations of 
> the kernel docs. And some V4L videobuf changes.
> 
> Other than that, it's pretty much a lot of small fixes (maybe not 
> one-liners, but we're talking "a few lines"). Networking, USB, scsi, 
> wireless, infiniband, IDE... With some alpha, ia64 and x86 arch updates.
> 
> The regression list keeps shrinking, so we're still on track for a full 
> 2.6.24 release in early January. Assuming we don't all overeat during the 
> holidays and nobody gets any work done. But we all know that the holidays 
> are really the time when we get away from the boring "real work", and can 
> spend 24/7 on kernel hacking instead, right?
> 
> Here's to a merry christmas, doing the whole druidic festival around the 
> tree thing,
When my automated testing system applied it to 2.6.23, the error below
stopped the testing.

***
Hunk #3 FAILED at 534.
1 out of 3 hunks FAILED -- saving rejects to file 
drivers/video/mbx/reg_bits.h.rej
patching file drivers/video/mbx/regs.h

-yanmin




Re: [PATCH] misc: Removal of final callers using fastcall

2007-12-20 Thread Andrew Morton
On Wed, 12 Dec 2007 15:38:26 -0800 Harvey Harrison <[EMAIL PROTECTED]> wrote:

> Andrew, I'm not sure who is best to hit with these final dribs and
> drabs removing fastcall.  Once all of these have hit Linus' tree
> I will send a final patch deleting the include/linux/linkage.h
> definitions as well as any remaining occurrences.

Yes, that's a good approach, thanks.  Wait until the tree is fastcall-clean
and then kill the definition(s).

I think I skipped rather a lot of remove-fastcall patches because a)
suitable maintainers were cc'ed and b) I was going through a
suicidal-over-bug-reports phase.

Please keep them coming - I've always disliked fastcall.


Re: [Jan Beulich] [PATCH] constify tables in kernel/sysctl_check.c

2007-12-20 Thread Dave Jones
On Thu, Dec 20, 2007 at 04:14:05PM -0700, Eric W. Biederman wrote:

 > Remains the question whether it is intended that many, perhaps even
 > large, tables are compiled in without ever having a chance to get used,
 > i.e. whether there shouldn't #ifdef CONFIG_xxx get added.

 > -static struct trans_ctl_table trans_net_ax25_param_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

we lost the _param, which will cause a duplicate definition with ..
 
 > -static struct trans_ctl_table trans_net_ax25_table[] = {
 > +static const struct trans_ctl_table trans_net_ax25_table[] = {

cut-n-paste thinko ?

Dave

-- 
http://www.codemonkey.org.uk


[PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd

2007-12-20 Thread Benjamin Herrenschmidt
The sense buffer in scsi_cmnd can nowadays be DMA'ed into directly
by some low level drivers (that typically happens with USB mass
storage).

This is a problem on non cache coherent architectures such as
embedded PowerPCs where the sense buffer can share cache lines with
other structure members, which leads to various forms of corruption.

This uses the newly defined __dma_buffer annotation to enforce that
on such platforms, the sense_buffer is contained within its own
cache line. This has no effect on cache coherent architectures.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 include/scsi/scsi_cmnd.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-merge.orig/include/scsi/scsi_cmnd.h   2007-12-21 13:07:14.0 +1100
+++ linux-merge/include/scsi/scsi_cmnd.h        2007-12-21 13:07:29.0 +1100
@@ -88,7 +88,7 @@ struct scsi_cmnd {
   working on */
 
 #define SCSI_SENSE_BUFFERSIZE  96
-   unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
+   unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer;
/* obtained by REQUEST SENSE when
 * CHECK CONDITION is received on original
 * command (auto-sense) */


[PATCH 1/2] DMA buffer alignment annotations

2007-12-20 Thread Benjamin Herrenschmidt
This patch, based on some earlier work by Roland Dreier, introduces
a pair of annotations that can be used to enforce alignment of
objects that can be DMA'ed into, and to enforce that a DMA'able
object within a structure isn't sharing a cache line with some
other object.

Such sharing of a data structure between DMA and non-DMA objects
isn't a recommended practice, but it does happen and in some cases
might even make sense, so we now have a way to make it work
properly.

The current patch only enables such alignment for some PowerPC
platforms that do not have coherent caches. Other platforms such
as ARM, MIPS, etc... can define ARCH_MIN_DMA_ALIGNMENT if they
want to benefit from this, I don't know them well enough to do
it myself.

The initial issue I'm fixing (in a second patch) by using these
is the SCSI sense buffer which is currently part of the scsi
command structure and can be DMA'ed to. On non-coherent platforms,
this causes various corruptions as this cache line is shared with
various other fields of the scsi_cmnd data structure.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 Documentation/DMA-mapping.txt |   32 
 include/asm-generic/page.h|   10 ++
 include/asm-powerpc/page.h|8 
 3 files changed, 50 insertions(+)

--- linux-merge.orig/include/asm-generic/page.h 2007-07-27 13:44:45.0 +1000
+++ linux-merge/include/asm-generic/page.h      2007-12-21 13:07:28.0 +1100
@@ -20,6 +20,16 @@ static __inline__ __attribute_const__ in
return order;
 }
 
+#ifndef ARCH_MIN_DMA_ALIGNMENT
+#define __dma_aligned
+#define __dma_buffer
+#else
+#define __dma_aligned          __attribute__((aligned(ARCH_MIN_DMA_ALIGNMENT)))
+#define __dma_buffer           __dma_buffer_line(__LINE__)
+#define __dma_buffer_line(line)        __dma_aligned;\
+                               char __dma_pad_##line[0] __dma_aligned
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 
Index: linux-merge/include/asm-powerpc/page.h
===
--- linux-merge.orig/include/asm-powerpc/page.h 2007-09-28 11:42:10.0 +1000
+++ linux-merge/include/asm-powerpc/page.h      2007-12-21 13:15:02.0 +1100
@@ -77,6 +77,14 @@
 #define VM_DATA_DEFAULT_FLAGS64        (VM_READ | VM_WRITE | \
                                  VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
+/*
+ * On non cache coherent platforms, we enforce cache aligned DMA
+ * buffers inside of structures
+ */
+#ifdef CONFIG_NOT_COHERENT_CACHE
+#define ARCH_MIN_DMA_ALIGNMENT L1_CACHE_BYTES
+#endif
+
 #ifdef __powerpc64__
 #include 
 #else
Index: linux-merge/Documentation/DMA-mapping.txt
===
--- linux-merge.orig/Documentation/DMA-mapping.txt      2007-12-21 13:17:14.0 +1100
+++ linux-merge/Documentation/DMA-mapping.txt   2007-12-21 13:20:00.0 +1100
@@ -75,6 +75,38 @@ What about block I/O and networking buff
 networking subsystems make sure that the buffers they use are valid
 for you to DMA from/to.
 
+Note that on non-cache-coherent architectures, having a DMA buffer
+that shares a cache line with other data can lead to memory
+corruption.
+
+The __dma_buffer macro exists to allow safe DMA buffers to be declared
+easily and portably as part of larger structures without causing bloat
+on cache-coherent architectures. To get this macro, architectures have
+to define ARCH_MIN_DMA_ALIGNMENT to the requested alignment value in
+their asm/page.h before including asm-generic/page.h
+
+Of course these structures must be contained in memory that can be
+used for DMA as described above.
+
+To use __dma_buffer, just declare a struct like:
+
+   struct mydevice {
+   int field1;
+   char buffer[BUFFER_SIZE] __dma_buffer;
+   int field2;
+   };
+
+If this is used in code like:
+
+   struct mydevice *dev;
+   dev = kmalloc(sizeof *dev, GFP_KERNEL);
+
+then dev->buffer will be safe for DMA on all architectures.  On a
+cache-coherent architecture the members of dev will be aligned exactly
+as they would have been without __dma_buffer; on a non-cache-coherent
+architecture buffer and field2 will be aligned so that buffer does not
+share a cache line with any other data.
+
DMA addressing limitations
 
 Does your device have any DMA addressing limitations?  For example, is
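
The layout effect described in the documentation hunk can be checked in
userspace with GCC or Clang. This is only a sketch under assumptions, not
the macros from the patch: the DEMO_* names are hypothetical, the 64-byte
alignment is an assumed cache line size, the zero-length pad member relies
on a GNU C extension, and the pad name is built with a two-step paste so
that __LINE__ is expanded before ## is applied.

```c
#include <stddef.h>

#define DEMO_DMA_ALIGNMENT 64   /* assumed cache line size */
#define DEMO_BUFFER_SIZE   96

#define DEMO_DMA_ALIGNED __attribute__((aligned(DEMO_DMA_ALIGNMENT)))

/* Two-step paste: the outer macro expands __LINE__, the inner one pastes. */
#define DEMO_PASTE2(a, b) a##b
#define DEMO_PASTE(a, b)  DEMO_PASTE2(a, b)

/*
 * Align the annotated member, then add a zero-length, equally aligned
 * pad member so that whatever follows starts on a fresh boundary.
 */
#define DEMO_DMA_BUFFER \
    DEMO_DMA_ALIGNED; char DEMO_PASTE(demo_dma_pad_, __LINE__)[0] DEMO_DMA_ALIGNED

struct demo_device {
    int field1;
    char buffer[DEMO_BUFFER_SIZE] DEMO_DMA_BUFFER;
    int field2;
};

/* Returns 1 when no other member shares an alignment unit with buffer. */
static int demo_buffer_is_isolated(void)
{
    size_t align = DEMO_DMA_ALIGNMENT;
    size_t start = offsetof(struct demo_device, buffer);
    size_t end = start + DEMO_BUFFER_SIZE;
    size_t next = offsetof(struct demo_device, field2);
    size_t line_end = (end + align - 1) / align * align;

    return start % align == 0 && next >= line_end;
}
```

Under these assumptions buffer starts on a 64-byte boundary and field2 is
pushed to the next boundary after the buffer ends, so neither neighbour
shares an alignment unit with the DMA area. The cost is padding on both
sides of the buffer, which is why the patch makes the annotation expand to
nothing when ARCH_MIN_DMA_ALIGNMENT is not defined.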


Re: [PATCH 0/5] sg_ring for scsi

2007-12-20 Thread FUJITA Tomonori
On Fri, 21 Dec 2007 10:13:38 +1100
Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Thursday 20 December 2007 18:58:07 David Miller wrote:
> > From: Rusty Russell <[EMAIL PROTECTED]>
> > Date: Thu, 20 Dec 2007 18:53:48 +1100
> >
> > > Manipulating the magic chains is horrible; it looks simple to the
> > > places which simply want to iterate through it, but it's awful for
> > > code which wants to create them.
> >
> > I'm not saying complexity is inherent in this stuff, but
> > assuming that it is the complexity should live as far away
> > from the minions (the iterators in this case).  Therefore,
> > the creators is the right spot for the hard stuff.
> 
> In this case, the main benefit of the sg chaining was that the conversion of 
> most scsi drivers was easy (basically sg++ -> sg = sg_next(sg)).  The 
> conversion to sg_ring is more complex, but the end result is not 
> significantly more complex.
> 
> However, the cost to code which manipulates sg chains was significant: I 
> tried 
> using them in virtio and it was too ugly to live (so that doesn't support sg 
> chaining).  If this was the best we could do, that'd be fine.
> 
> But, as demonstrated, there are real benefits of having an explicit header:

I'm not sure that chaining the headers (as your sg_ring and
scsi_sgtable do) would simplify LLDs. Have you looked at ips or
qla1280?


Re: PCI resource problems caused by improper address rounding

2007-12-20 Thread Benjamin Herrenschmidt

> So in your case, it should *result* in the exact same situation that your 
> patch did, but at the same time, when dealing with the (more common) case 
> of smaller allocations, we still continue to try to avoid being too close 
> to the top-of-memory.
> 
> So it's not perfect, but perhaps it is a good compromise between being 
> careful and having to make room?
> 
> Does this work for your case?

I'm not totally happy with changing the generic code like that, to
possibly not enforce "min" anymore. Other archs may have very good
reasons to provide a min value here... Though at the same time, at
least on powerpc, the parent resource of the host bridge will be the
real limit, so that may not be a big issue.

Cheers,
Ben.




Re: [PATCH 2/5] dma_map_sg_ring() helper

2007-12-20 Thread FUJITA Tomonori
On Thu, 20 Dec 2007 16:40:00 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Rusty Russell <[EMAIL PROTECTED]>
> Date: Fri, 21 Dec 2007 11:35:12 +1100
> 
> > On Friday 21 December 2007 11:00:27 FUJITA Tomonori wrote:
> > > We need to pass the whole sg entries to the IOMMUs at a time.
> > 
> > Hi Fujita,
> > 
> > OK, it's certainly possible to have an arch override.  For which 
> > architecture is this BTW?
> 
> SPARC64, POWERPC, maybe IA-64 etc.

And x86_64, Alpha, and PARISC.


> Basically any platform that potentially does virtual
> remapping and thus linearization.
> 
> I think it should always be provided, the new APIs give
> less information to the implementation and that's a step
> backwards.

Agreed.


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Robert Hancock

Tony Camuso wrote:
> Robert Hancock wrote:
> > First off, I would like to see confirmation from the horses' mouths
> > here (namely AMD, ServerWorks/Broadcom, and whoever else) that there
> > is no other way to get around this problem than disabling MMCONFIG for
> > accesses behind those chips.
>
> I happen to have this one stored in my desktop.
>
> From AMD-8132™ HyperTransport™ PCI-X® 2.0 Tunnel Revision Guide
>
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf
>
> 79 AMD-8132™ Tunnel Lacks Extended Configuration
> Space Memory-Mapped I/O Base Address Register
>
> Description
>
> Current AMD processors do not natively support PCI-defined extended
> configuration space. A memory mapped I/O base address register (MMIO
> BAR) is required in chipset devices to support extended configuration
> space. The AMD-8132 does not have this MMIO BAR.
>
> Potential Effect On System
>
> The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO
> BAR to support extended configuration space. Using a device which does
> have this MMIO BAR and an AMD-8132 on the same HyperTransport™ link of
> the processor may cause firmware/software problems.
>
> The base configuration space of the AMD-8132 and PCI(-X) devices
> attached to it are accessible using only the mechanism defined in
> PCI 2.3. Registers of PCI-X Mode 2 devices attached to the AMD-8132 in
> the extended configuration space are not accessible. The AMD-8132 has
> no registers in the extended configuration space.
>
> Suggested Workaround
>
> It is strongly recommended that system designers do not connect the
> AMD-8132 and devices that use extended configuration space MMIO BARs
> (ex: HyperTransport-to-PCI Express® bridges) to the same processor
> HyperTransport link.
>
> Fix Planned
> No


That does sound fairly definitive. I have to wonder why certain system 
designers then didn't follow their strong recommendation..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: Trailing periods in kernel messages

2007-12-20 Thread Andrew Morton
On Fri, 21 Dec 2007 02:43:33 +0100 Frans Pop <[EMAIL PROTECTED]> wrote:

> On Thursday 20 December 2007, Alan Cox wrote:
> > The kernel printk messages are sentences.
> 
> I'm afraid that I completely and utterly disagree. Kernel messages are _not_ 
> sentences. The vast majority is not well-formed and does not contain any of 
> the elements that are required for a proper sentence.
> 
> The most kernel messages can be compared to is a rather diverse and sloppy 
> enumeration. And enumerations follow completely different rules than 
> sentences. It can better be characterized as a "semi-random sequence of 
> context-sensitive technical messages".
> 
> IMHO the existing rule that "Kernel messages do not have to be terminated 
> with a period." is completely justified, though it does need some minor 
> clarification on the cases in which proper punctuation _should_ be 
> followed.

No-period is a kernel idiom, produces perfectly readable output, I have
never ever heard of anyone expressing the least concern over a lack of dots
at the end of their printks and 91% of kernel code agrees.

otoh the place where no-dots comes horridly unstuck is if a single printk
contains two sentences:

printk("My computer caught on fire.  I hope yours does too\n");

that's really daft.  It's very rare though.


Of course one could always patch syslogd to add the dots, or change printk
and add an i_am_anal=1 kernel boot option.


Andy, please have an accident with that checkpatch change and let's hope
like hell that nobody starts trying to "fix" any of this.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:
> The case of the device built into the K8 northbridge that's unreachable
> by MMCONFIG kind of makes sense, since the northbridge is what's
> translating the MMCONFIG memory access into config accesses. It seems
> bizarre to me that a bridge chip could possibly have such a problem. The
> MMCONFIG access should get translated into a configuration space access
> in the northbridge and from that point on there's no difference between
> an MMCONFIG and type1 access.

Robert's point is well taken.

Only northbridge chips can give us this kind of trouble, and the only
chips mentioned in the present discussion as not being mmconf-compliant
are northbridges (8132, ht1000).

The patch is aware of this, so once a root bus has been programmed for
legacy pci config access, all descendent buses automatically inherit
this access mechanism and are therefore not probed by the patch.
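
As a rough illustration of that inheritance (a sketch only: the struct, the
enum and the function name below are hypothetical and are not taken from the
patch), the access method of any bus can be derived by walking up to its root
bus, so descendants never need a probe of their own:

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum cfg_access { CFG_MMCONFIG, CFG_LEGACY };

struct bus {
	const struct bus *parent;	/* NULL for a root bus */
	enum cfg_access root_access;	/* meaningful only on a root bus */
};

/* A bus uses whatever access method was chosen for its root bus. */
enum cfg_access bus_access(const struct bus *b)
{
	while (b->parent)
		b = b->parent;
	return b->root_access;
}
```

With this layout a child bus stores no access method at all, which is one
way to make sure it can never disagree with its root.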




Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli

On 12/20/2007 6:21 PM, Tony Camuso wrote:
>
> And the MMCONFIG problem with enterprise systems and workstations, where
> we do control the BIOS (for the most part), is due to known bugs in
> certain versions of certain chipsets, HT1000, AMD8132, among them, not
> the BIOS.



The lack of MMCONFIG support is indeed because some hypertransport
chipsets lack that support. But there are some BIOSes out there that are
advertising support for all busses in their MCFG acpi attribute (even
the busses managed by some amd8131 in a mixed nvidia-ck804/amd8131
motherboard), and the BIOS seems at least faulty for advertising a
capability that does not exist.


Loic



Re: almost daily Kernel oops with 2.6.23.9 - and now 2.6.23.11 as well

2007-12-20 Thread Hemmann, Volker Armin
Ok, so after the holidays I will do the following:

let memtest86+ run several hours.
do a full backup to switch to r3 and build an unpatched kernel.
see if I can reproduce the oops with .21 and .22 (because AFAIR no oops with 
21.. but I might be wrong).

Not exactly in that order.

Glück Auf
Volker


ps: please cc me. I am not subscribed to lkml.


[patch 08/24] Text Edit Lock - kprobes x86_32

2007-12-20 Thread Mathieu Desnoyers
Make kprobes use INIT_ARRAY().

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Tested-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/kprobes_32.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/kprobes_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_32.c   2007-11-13 09:45:35.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_32.c   2007-11-13 09:45:44.0 -0500
@@ -176,12 +176,13 @@ int __kprobes arch_prepare_kprobe(struct
 
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
-   text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
+   text_poke(p->addr, INIT_ARRAY(unsigned char, BREAKPOINT_INSTRUCTION, 1),
+   1);
 }
 
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
-   text_poke(p->addr, &p->opcode, 1);
+   text_poke(p->addr, INIT_ARRAY(unsigned char, p->opcode, 1), 1);
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: [Bug 9182] Critical memory leak (dirty pages)

2007-12-20 Thread Nick Piggin
On Friday 21 December 2007 06:24, Linus Torvalds wrote:
> On Thu, 20 Dec 2007, Jan Kara wrote:
> >   As I wrote in my previous email, this solution works but hides the
> > fact that the page really *has* dirty data in it and *is* pinned in
> > memory until the commit code gets to writing it. So in theory it could
> > disturb the writeout logic by having more dirty data in memory than vm
> > thinks it has. Not that I'd have a better fix now but I wanted to point
> > out this problem.
>
> Well, I worry more about the VM being sane - and by the time we actually
> hit this case, as far as VM sanity is concerned, the page no longer really
> exists. It's been removed from the page cache, and it only really exists
> as any other random kernel allocation.

It does allow the VM to just not worry about this. However I don't
really like these kinds of catch-all conditions that are hard to get
rid of and can encourage bad behaviour.

It would be nice if the "insane" things were made to clean up after
themselves.


> The fact that low-level filesystems (in this case ext3 journaling) do
> their own insane things is not something the VM even _should_ care about.
> It's just an internal FS allocation, and the FS can do whatever the hell
> it wants with it, including doing IO etc.
>
> The kernel doesn't consider any other random IO pages to be "dirty" either
> (eg if you do direct-IO writes using low-level SCSI commands, the VM
> doesn't consider that to be any special dirty stuff, it's just random page
> allocations again). This is really no different.
>
> In other words: the Linux "VM" subsystem is really two different parts: the
> low-level page allocator (which obviously knows that the page is still in
> *use*, since it hasn't been free'd), and the higher-level file mapping and
> caching stuff that knows about things like page "dirtyiness". And once
> you've done a "remove_from_page_cache()", the higher levels are no longer
> involved, and dirty accounting simply doesn't get into the picture.

That's all true... it would simply be nice to ask the filesystems to do
this. But anyway I think your patch is pretty reasonable for the moment.


[patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread Mathieu Desnoyers
x86 optimization of the immediate values which uses a movl with code patching
to set/unset the value used to populate the register used as variable source.

Changelog:
- Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing
  non atomic writes to a code region only touched by us (nobody can execute it
  since we are protected by the imv_mutex).
- Put imv_set and _imv_set in the architecture independent header.
- Use $0 instead of %2 with (0) operand.
- Add x86_64 support, ready for i386+x86_64 -> x86 merge.
- Use asm-x86/asm.h.

Ok, so the most flexible solution that I see, that should fit for both
i386 and x86_64 would be :
1 byte     : "=Q" : Any register accessible as rh: a, b, c, and d.
2, 4 bytes : "=R" : Legacy register: the eight integer registers available
                    on all i386 processors (a, b, c, d, si, di, bp, sp).
8 bytes    : (only for x86_64)
             "=r" : A register operand is allowed provided that it is in a
                    general register.
That should make sure x86_64 won't try to use REX prefixed opcodes for
1, 2 and 4 bytes values.

- Create the instruction in a discarded section to calculate its size. This is
  how we can align the beginning of the instruction on an address that will
  permit atomic modification of the immediate value without knowing the size of
  the opcode used by the compiler.
- Bugfix : 8 bytes 64 bits immediate value was declared as "4 bytes" in the
  immediate structure.
- Change the immediate.c update code to support variable length opcodes.

- Vastly simplified, using a busy looping IPI with interrupts disabled.
  Does not protect against NMI nor MCE.
- Pack the __imv section. Use smallest types required for size (char).
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: "H. Peter Anvin" <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig|1 
 include/asm-x86/immediate.h |   77 
 2 files changed, 78 insertions(+)

Index: linux-2.6-lttng/include/asm-x86/immediate.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/include/asm-x86/immediate.h 2007-11-21 11:04:33.0 -0500
@@ -0,0 +1,77 @@
+#ifndef _ASM_X86_IMMEDIATE_H
+#define _ASM_X86_IMMEDIATE_H
+
+/*
+ * Immediate values. x86 architecture optimizations.
+ *
+ * (C) Copyright 2006 Mathieu Desnoyers <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#include 
+
+/**
+ * imv_read - read immediate variable
+ * @name: immediate value name
+ *
+ * Reads the value of @name.
+ * Optimized version of the immediate.
+ * Do not use in __init and __exit functions. Use _imv_read() instead.
+ * If size is bigger than the architecture long size, fall back on a memory
+ * read.
+ *
+ * Make sure to populate the initial static 64 bits opcode with a value
+ * what will generate an instruction with 8 bytes immediate value (not the REX.W
+ * prefixed one that loads a sign extended 32 bits immediate value in a r64
+ * register).
+ */
+#define imv_read(name) \
+   ({  \
+   __typeof__(name##__imv) value;  \
+   BUILD_BUG_ON(sizeof(value) > 8);\
+   switch (sizeof(value)) {\
+   case 1: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   _ASM_PTR "%c1, (3f)-%c2\n\t"\
+   ".byte %c2\n\t" \
+   ".previous\n\t" \
+   "mov $0,%0\n\t" \
+   "3:\n\t"\
+   : "=q" (value)  \
+   : "i" (&name##__imv),   \
+ "i" (sizeof(value))); \
+   break;  \
+   case 2: \
+   case 4: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   _ASM_PTR "%c1, (3f)-%c2\n\t"\
+   ".byte %c2\n\t" \
+

[patch 10/24] Text Edit Lock - x86_32 standardize debug rodata

2007-12-20 Thread Mathieu Desnoyers
Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/mm/init_32.c |   20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

Index: linux-2.6-lttng/arch/x86/mm/init_32.c
===
--- linux-2.6-lttng.orig/arch/x86/mm/init_32.c  2007-11-13 09:25:29.0 -0500
+++ linux-2.6-lttng/arch/x86/mm/init_32.c   2007-11-13 09:45:48.0 -0500
@@ -784,28 +784,21 @@ static int noinline do_test_wp_bit(void)
 }
 
 #ifdef CONFIG_DEBUG_RODATA
-
 void mark_rodata_ro(void)
 {
unsigned long start = PFN_ALIGN(_text);
unsigned long size = PFN_ALIGN(_etext) - start;
 
-#ifndef CONFIG_KPROBES
-#ifdef CONFIG_HOTPLUG_CPU
-   /* It must still be possible to apply SMP alternatives. */
-   if (num_possible_cpus() <= 1)
-#endif
-   {
-   change_page_attr(virt_to_page(start),
-size >> PAGE_SHIFT, PAGE_KERNEL_RX);
-   printk("Write protecting the kernel text: %luk\n", size >> 10);
-   }
-#endif
+   change_page_attr(virt_to_page(start),
+   size >> PAGE_SHIFT, PAGE_KERNEL_RX);
+   printk(KERN_INFO "Write protecting the kernel text: %luk\n",
+   size >> 10);
+
start += size;
size = (unsigned long)__end_rodata - start;
change_page_attr(virt_to_page(start),
 size >> PAGE_SHIFT, PAGE_KERNEL_RO);
-   printk("Write protecting the kernel read-only data: %luk\n",
+   printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
   size >> 10);
 
/*
@@ -816,6 +809,7 @@ void mark_rodata_ro(void)
 */
global_flush_tlb();
 }
+
 #endif
 
 void free_init_pages(char *what, unsigned long begin, unsigned long end)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 02/24] Kprobes - do not use kprobes mutex in arch code

2007-12-20 Thread Mathieu Desnoyers
Remove the kprobes mutex from kprobes.h, since it does not belong there. Also
remove all use of this mutex in the architecture specific code, replacing it by
a proper mutex lock/unlock in the architecture agnostic code.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
---
 arch/ia64/kernel/kprobes.c|2 --
 arch/powerpc/kernel/kprobes.c |2 --
 arch/s390/kernel/kprobes.c|2 --
 arch/x86/kernel/kprobes_32.c  |2 --
 arch/x86/kernel/kprobes_64.c  |2 --
 include/linux/kprobes.h   |2 --
 kernel/kprobes.c  |2 ++
 7 files changed, 2 insertions(+), 12 deletions(-)

Index: linux-2.6-lttng/include/linux/kprobes.h
===
--- linux-2.6-lttng.orig/include/linux/kprobes.h   2007-12-10 09:53:27.0 -0500
+++ linux-2.6-lttng/include/linux/kprobes.h 2007-12-12 18:10:34.0 -0500
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #ifdef CONFIG_KPROBES
 #include 
@@ -183,7 +182,6 @@ static inline void kretprobe_assert(stru
 }
 
 extern spinlock_t kretprobe_lock;
-extern struct mutex kprobe_mutex;
 extern int arch_prepare_kprobe(struct kprobe *p);
 extern void arch_arm_kprobe(struct kprobe *p);
 extern void arch_disarm_kprobe(struct kprobe *p);
Index: linux-2.6-lttng/arch/x86/kernel/kprobes_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_32.c   2007-12-10 09:53:27.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_32.c   2007-12-12 18:10:34.0 -0500
@@ -186,9 +186,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-   mutex_lock(&kprobe_mutex);
free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1));
-   mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
Index: linux-2.6-lttng/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/kernel/kprobes.c   2007-12-12 18:10:32.0 -0500
+++ linux-2.6-lttng/kernel/kprobes.c   2007-12-12 18:10:34.0 -0500
@@ -644,7 +644,9 @@ valid_p:
list_del_rcu(&p->list);
kfree(old_p);
}
+   mutex_lock(&kprobe_mutex);
arch_remove_kprobe(p);
+   mutex_unlock(&kprobe_mutex);
} else {
mutex_lock(&kprobe_mutex);
if (p->break_handler)
Index: linux-2.6-lttng/arch/ia64/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/arch/ia64/kernel/kprobes.c 2007-12-12 18:06:06.0 -0500
+++ linux-2.6-lttng/arch/ia64/kernel/kprobes.c  2007-12-12 18:10:34.0 -0500
@@ -582,9 +582,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-   mutex_lock(&kprobe_mutex);
free_insn_slot(p->ainsn.insn, 0);
-   mutex_unlock(&kprobe_mutex);
 }
 /*
  * We are resuming execution after a single step fault, so the pt_regs
Index: linux-2.6-lttng/arch/powerpc/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/kprobes.c  2007-12-10 09:53:27.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/kprobes.c   2007-12-12 18:10:34.0 -0500
@@ -88,9 +88,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-   mutex_lock(&kprobe_mutex);
free_insn_slot(p->ainsn.insn, 0);
-   mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
Index: linux-2.6-lttng/arch/s390/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/arch/s390/kernel/kprobes.c 2007-12-10 09:53:27.0 -0500
+++ linux-2.6-lttng/arch/s390/kernel/kprobes.c  2007-12-12 18:10:34.0 -0500
@@ -220,9 +220,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-   mutex_lock(&kprobe_mutex);
free_insn_slot(p->ainsn.insn, 0);
-   mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
Index: linux-2.6-lttng/arch/x86/kernel/kprobes_64.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_64.c   2007-12-10 09:53:27.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_64.c   2007-12-12 18:10:34.0 -0500
@@ -225,9 +225,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-   mutex_lock(&kprobe_mutex);
free_insn_slot(p->ainsn.insn, 0);
-   mutex_unlock(&kprobe_mutex);
 }
 
 static void 

[patch 12/24] Immediate Values - Architecture Independent Code

2007-12-20 Thread Mathieu Desnoyers
Immediate values are used as read-mostly variables that are rarely updated. They
use code patching to modify the values inscribed in the instruction stream. This
provides a way to save precious cache lines that would otherwise have to be used
by these variables.

There is a generic _imv_read() version, which uses standard global
variables, and optimized per architecture imv_read() implementations,
which use a load immediate to remove a data cache hit. When the immediate values
functionality is disabled in the kernel, it falls back to global variables.

It adds a new rodata section "__imv" to place the pointers to the enable
value. Immediate values activation functions sit in kernel/immediate.c.

Immediate values refer to the memory address of a previously declared integer.
This integer holds the information about the state of the immediate values
associated, and must be accessed through the API found in linux/immediate.h.

At module load time, each immediate value is checked to see if it must be
enabled. It would be the case if the variable they refer to is exported from
another module and already enabled.

In the early stages of start_kernel(), the immediate values are updated to
reflect the state of the variable they refer to.
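
A minimal userspace sketch of the global-variable fallback described above.
The macro bodies are assumptions made for illustration; only the names
DEFINE_IMV, imv_read, _imv_read and imv_set and the name##__imv naming appear
in these patches, and rarely_enabled_path() is a hypothetical call site:

```c
/* Assumed fallback definitions: an immediate value is just a global. */
#define DEFINE_IMV(type, name)	type name##__imv
#define _imv_read(name)		(name##__imv)
#define imv_read(name)		_imv_read(name)
#define imv_set(name, i)	((void)(name##__imv = (i)))

/* Userspace stand-in for the kernel's unlikely() (GCC/Clang builtin). */
#define unlikely(x)		__builtin_expect(!!(x), 0)

DEFINE_IMV(char, this_immediate);

/* Hypothetical call site: the guarded code is rarely enabled. */
int rarely_enabled_path(void)
{
	if (unlikely(imv_read(this_immediate)))
		return 1;	/* feature switched on */
	return 0;		/* common case */
}
```

In this fallback every imv_read() is an ordinary memory load; an optimized
architecture would instead encode the value in the instruction itself, so
callers written this way would not have to change.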

* Why should this be merged *

It improves performance on heavy memory I/O workloads.

An interesting result shows the potential this infrastructure has by
showing the slowdown a simple system call such as getppid() suffers when it is
used under heavy user-space cache trashing:

Random walk L1 and L2 trashing surrounding a getppid() call:
(note: in this test, do_syscall_trace was taken at each system call, see
Documentation/immediate.txt in these patches for details)
- No memory pressure :   getppid() takes  1573 cycles
- With memory pressure : getppid() takes 15589 cycles

We therefore have a slowdown of about 10 times just to get the kernel variables
from memory. Another test on the same architecture (Intel P4) measured the memory
latency to be 559 cycles. Therefore, each cache line removed from the hot path
would improve the syscall time by about 3.5% in these conditions.
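
For readers who want to check the two figures, here is the arithmetic as a
small C sketch. The cycle counts are the ones quoted above; the macro and
function names are made up for this example:

```c
/* Cycle counts quoted in this message (Intel P4). */
#define GETPPID_NO_PRESSURE	1573.0
#define GETPPID_WITH_PRESSURE	15589.0
#define MEMORY_LATENCY		559.0

/* Syscall time under cache pressure relative to the unloaded time. */
double slowdown_factor(void)
{
	return GETPPID_WITH_PRESSURE / GETPPID_NO_PRESSURE;
}

/* Share of the slowed-down syscall spent on one memory access, in percent. */
double percent_per_cache_line(void)
{
	return 100.0 * MEMORY_LATENCY / GETPPID_WITH_PRESSURE;
}
```

The first function gives a little under 10, and the second a little over
3.5, which matches the rounded numbers in the text.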

Changelog:

- section __imv is already SHF_ALLOC
- Because of the wonders of ELF, section 0 has sh_addr and sh_size 0.  So
  the if (immediateindex) is unnecessary here.
- Remove module_mutex usage: depend on functions implemented in module.c for
  that.
- Does not update tainted module's immediate values.
- remove imv_*_t types, add DECLARE_IMV() and DEFINE_IMV().
  - imv_read() becomes imv_read(var) because of this.
- Adding a new EXPORT_IMV_SYMBOL(_GPL).
- remove imv_if(). Should use if (unlikely(imv_read(var))) instead.
  - Wait until we have gcc support before we add the imv_if macro, since
its form may have to change.
- Don't declare the __imv section in vmlinux.lds.h, just put the content
  in the rodata section.
- Simplify interface : remove imv_set_early, keep track of kernel boot
  status internally.
- Remove the ALIGN(8) before the __imv section. It is packed now.
- Uses an IPI busy-loop on each CPU with interrupts disabled as a simple,
  architecture agnostic, update mechanism.
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
---
 include/asm-generic/vmlinux.lds.h |3 
 include/linux/immediate.h |   94 +++
 include/linux/module.h|   16 +++
 init/main.c   |8 +
 kernel/Makefile   |1 
 kernel/immediate.c|  187 ++
 kernel/module.c   |   50 +-
 7 files changed, 358 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/linux/immediate.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/include/linux/immediate.h   2007-11-28 09:32:04.0 -0500
@@ -0,0 +1,94 @@
+#ifndef _LINUX_IMMEDIATE_H
+#define _LINUX_IMMEDIATE_H
+
+/*
+ * Immediate values, can be updated at runtime and save cache lines.
+ *
+ * (C) Copyright 2007 Mathieu Desnoyers <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#ifdef CONFIG_IMMEDIATE
+
+struct __imv {
+   unsigned long var;  /* Pointer to the identifier variable of the
+* immediate value
+*/
+   unsigned long imv;  /*
+* Pointer to the memory location of the
+* immediate value within the instruction.
+*/
+   unsigned char size; /* Type size. */
+} __attribute__ ((packed));
+
+#include 
+
+/**
+ * imv_set - set immediate variable (with locking)
+ * @name: immediate value name
+ * @i: required value
+ *
+ * Sets the value of @name, taking the module_mutex if required by
+ * the architecture.
+ */
+#define 

[patch 07/24] Text Edit Lock - kprobes architecture independent support

2007-12-20 Thread Mathieu Desnoyers
Use the mutual exclusion provided by the text edit lock in the kprobes code. It
allows coherent manipulation of the kernel code by other subsystems.

Changelog:

Move the kernel_text_lock/unlock out of the for loops.
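
The changelog item can be pictured with a userspace sketch. A pthread mutex
stands in for the text edit lock; the table, its size, the counter and
arm_all() are hypothetical. The lock is taken once around the whole loop
rather than once per entry:

```c
#include <pthread.h>

#define TABLE_SIZE 64	/* hypothetical table size */

static pthread_mutex_t text_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int lock_count;		/* how many times the lock was taken */
static unsigned char armed[TABLE_SIZE];	/* stand-in for patched probe sites */

static void kernel_text_lock(void)
{
	pthread_mutex_lock(&text_mutex);
	lock_count++;	/* counted while holding the lock */
}

static void kernel_text_unlock(void)
{
	pthread_mutex_unlock(&text_mutex);
}

/* Arm every entry under a single lock/unlock pair; returns entries armed. */
unsigned int arm_all(void)
{
	unsigned int i;

	kernel_text_lock();
	for (i = 0; i < TABLE_SIZE; i++)
		armed[i] = 1;
	kernel_text_unlock();

	return i;
}

/* Number of lock acquisitions so far, for inspection. */
unsigned int text_lock_count(void)
{
	return lock_count;
}
```

One acquisition covers all TABLE_SIZE entries, at the cost of holding the
lock for the duration of the loop.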

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Roel Kluin <[EMAIL PROTECTED]>
---
 kernel/kprobes.c |   19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/kernel/kprobes.c   2007-11-16 13:40:06.0 -0500
+++ linux-2.6-lttng/kernel/kprobes.c   2007-11-17 10:00:23.0 -0500
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -568,9 +569,10 @@ static int __kprobes __register_kprobe(s
goto out;
}
 
+   kernel_text_lock();
ret = arch_prepare_kprobe(p);
if (ret)
-   goto out;
+   goto out_unlock_text;
 
INIT_HLIST_NODE(&p->hlist);
hlist_add_head_rcu(&p->hlist,
@@ -578,7 +580,8 @@ static int __kprobes __register_kprobe(s
 
if (kprobe_enabled)
arch_arm_kprobe(p);
-
+out_unlock_text:
+   kernel_text_unlock();
 out:
mutex_unlock(&kprobe_mutex);
 
@@ -621,8 +624,11 @@ valid_p:
 * enabled - otherwise, the breakpoint would already have
 * been removed. We save on flushing icache.
 */
-   if (kprobe_enabled)
+   if (kprobe_enabled) {
+   kernel_text_lock();
arch_disarm_kprobe(p);
+   kernel_text_unlock();
+   }
hlist_del_rcu(&old_p->hlist);
cleanup_p = 1;
} else {
@@ -644,9 +650,7 @@ valid_p:
list_del_rcu(&p->list);
kfree(old_p);
}
-   mutex_lock(&kprobe_mutex);
arch_remove_kprobe(p);
-   mutex_unlock(&kprobe_mutex);
} else {
mutex_lock(&kprobe_mutex);
if (p->break_handler)
@@ -717,7 +721,6 @@ static int __kprobes pre_handler_kretpro
ri->rp = rp;
ri->task = current;
arch_prepare_kretprobe(ri, regs);
-
/* XXX(hch): why is there no hlist_move_head? */
hlist_del(&ri->uflist);
hlist_add_head(&ri->uflist, &ri->rp->used_instances);
@@ -938,11 +941,13 @@ static void __kprobes enable_all_kprobes
if (kprobe_enabled)
goto already_enabled;
 
+   kernel_text_lock();
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
hlist_for_each_entry_rcu(p, node, head, hlist)
arch_arm_kprobe(p);
}
+   kernel_text_unlock();
 
kprobe_enabled = true;
printk(KERN_INFO "Kprobes globally enabled\n");
@@ -967,6 +972,7 @@ static void __kprobes disable_all_kprobe
 
kprobe_enabled = false;
printk(KERN_INFO "Kprobes globally disabled\n");
+   kernel_text_lock();
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
hlist_for_each_entry_rcu(p, node, head, hlist) {
@@ -974,6 +980,7 @@ static void __kprobes disable_all_kprobe
arch_disarm_kprobe(p);
}
}
+   kernel_text_unlock();
 
mutex_unlock(&kprobe_mutex);
/* Allow all currently running kprobes to complete */

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 16/24] Immediate Values - Powerpc Optimization

2007-12-20 Thread Mathieu Desnoyers
PowerPC optimization of the immediate values which uses a li instruction,
patched with an immediate value.

Changelog:
- Put imv_set and _imv_set in the architecture independent header.
- Pack the __imv section. Use smallest types required for size (char).
- Remove architecture specific update code : now handled by architecture
  agnostic code.
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Paul Mackerras <[EMAIL PROTECTED]>
---
 arch/powerpc/Kconfig|1 
 include/asm-powerpc/immediate.h |   55 
 2 files changed, 56 insertions(+)

Index: linux-2.6-lttng/include/asm-powerpc/immediate.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/include/asm-powerpc/immediate.h 2007-11-19 12:26:16.0 -0500
@@ -0,0 +1,55 @@
+#ifndef _ASM_POWERPC_IMMEDIATE_H
+#define _ASM_POWERPC_IMMEDIATE_H
+
+/*
+ * Immediate values. PowerPC architecture optimizations.
+ *
+ * (C) Copyright 2006 Mathieu Desnoyers <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#include 
+
+/**
+ * imv_read - read immediate variable
+ * @name: immediate value name
+ *
+ * Reads the value of @name.
+ * Optimized version of the immediate.
+ * Do not use in __init and __exit functions. Use _imv_read() instead.
+ */
+#define imv_read(name) \
+   ({  \
+   __typeof__(name##__imv) value;  \
+   BUILD_BUG_ON(sizeof(value) > 8);\
+   switch (sizeof(value)) {\
+   case 1: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   PPC_LONG "%c1, ((1f)-1)\n\t"\
+   ".byte 1\n\t"   \
+   ".previous\n\t" \
+   "li %0,0\n\t"   \
+   "1:\n\t"\
+   : "=r" (value)  \
+   : "i" (&name##__imv));  \
+   break;  \
+   case 2: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   PPC_LONG "%c1, ((1f)-2)\n\t"\
+   ".byte 2\n\t"   \
+   ".previous\n\t" \
+   "li %0,0\n\t"   \
+   "1:\n\t"\
+   : "=r" (value)  \
+   : "i" (&name##__imv));  \
+   break;  \
+   case 4: \
+   case 8: value = name##__imv;\
+   break;  \
+   };  \
+   value;  \
+   })
+
+#endif /* _ASM_POWERPC_IMMEDIATE_H */
Index: linux-2.6-lttng/arch/powerpc/Kconfig
===
--- linux-2.6-lttng.orig/arch/powerpc/Kconfig   2007-11-19 12:25:21.0 -0500
+++ linux-2.6-lttng/arch/powerpc/Kconfig   2007-11-19 12:26:01.0 -0500
@@ -81,6 +81,7 @@ config PPC
default y
select HAVE_OPROFILE
select HAVE_KPROBES
+   select HAVE_IMMEDIATE
 
 config EARLY_PRINTK
bool

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 17/24] Immediate Values - Documentation

2007-12-20 Thread Mathieu Desnoyers
Changelog:
- Remove imv_set_early (removed from API).
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
---
 Documentation/immediate.txt |  221 
 1 file changed, 221 insertions(+)

Index: linux-2.6-lttng/Documentation/immediate.txt
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/Documentation/immediate.txt 2007-11-03 20:28:58.0 -0400
@@ -0,0 +1,221 @@
+   Using the Immediate Values
+
+   Mathieu Desnoyers
+
+
+This document introduces Immediate Values and their use.
+
+
+* Purpose of immediate values
+
+An immediate value is used to compile into the kernel variables that sit within
+the instruction stream. They are meant to be rarely updated but read often.
+Using immediate values for these variables will save cache lines.
+
+This infrastructure is specialized in supporting dynamic patching of the values
+in the instruction stream when multiple CPUs are running without disturbing the
+normal system behavior.
+
+Compiling code meant to be rarely enabled at runtime can be done using
+if (unlikely(imv_read(var))) as condition surrounding the code. The
+smallest data type required for the test (an 8 bits char) is preferred, since
+some architectures, such as powerpc, only allow up to 16 bits immediate values.
+
+
+* Usage
+
+In order to use the "immediate" macros, you should include linux/immediate.h.
+
+#include <linux/immediate.h>
+
+DEFINE_IMV(char, this_immediate);
+EXPORT_IMV_SYMBOL(this_immediate);
+
+
+And use, in the body of a function:
+
+Use imv_set(this_immediate) to set the immediate value.
+
+Use imv_read(this_immediate) to read the immediate value.
+
+The immediate mechanism supports inserting multiple instances of the same
+immediate. Immediate values can be put in inline functions, inlined static
+functions, and unrolled loops.
+
+If you have to read the immediate values from a function declared as __init or
+__exit, you should explicitly use _imv_read(), which will fall back on a
+global variable read. Failing to do so will leave a reference to the __init
+section after it is freed (it would generate a modpost warning).
+
+You can choose to set an initial static value to the immediate by using, for
+instance:
+
+DEFINE_IMV(long, myptr) = 10;
+
+
+* Optimization for a given architecture
+
+One can implement optimized immediate values for a given architecture by
+replacing asm-$ARCH/immediate.h.
+
+
+* Performance improvement
+
+
+  * Memory hit for a data-based branch
+
+Here are the results on a 3GHz Pentium 4:
+
+number of tests: 100
+number of branches per test: 10
+memory hit cycles per iteration (mean): 636.611
+L1 cache hit cycles per iteration (mean): 89.6413
+instruction stream based test, cycles per iteration (mean): 85.3438
+Just getting the pointer from a modulo on a pseudo-random value, doing
+  nothing with it, cycles per iteration (mean): 77.5044
+
+So:
+Base case:  77.50 cycles
+instruction stream based test:  +7.8394 cycles
+L1 cache hit based test:+12.1369 cycles
+Memory load based test: +559.1066 cycles
+
+So let's say we have a ping flood coming at
+(14014 packets transmitted, 14014 received, 0% packet loss, time 1826ms)
+7674 packets per second. If we put 2 markers for irq entry/exit, it
+brings us to 15348 markers sites executed per second.
+
+(15348 exec/s) * (559 cycles/exec) / (3G cycles/s) = 0.0029
+We therefore have a 0.29% slowdown just on this case.
+
+Compared to this, the instruction stream based test will cause a
+slowdown of:
+
+(15348 exec/s) * (7.84 cycles/exec) / (3G cycles/s) = 0.00004
+For a 0.004% slowdown.
+
+If we plan to use this for memory allocation, spinlock, and all sorts of
+very high event rate tracing, we can assume it will execute 10 to 100
+times more sites per second, which brings us to a slowdown of up to 0.4%
+with the instruction stream based test, compared to up to 29% with the
+memory load based test on a system with high memory pressure.
+
+
+
+  * Markers impact under heavy memory load
+
+Running a kernel with my LTTng instrumentation set, in a test that
+generates memory pressure (from userspace) by trashing L1 and L2 caches
+between calls to getppid() (note: syscall_trace is active and calls
+a marker upon syscall entry and syscall exit; markers are disarmed).
+This test is done in user-space, so there are some delays due to IRQs
+coming and to the scheduler. (UP 2.6.22-rc6-mm1 kernel, task with -20
+nice level)
+
+My first set of results, linear cache trashing, turned out not to be
+very interesting, because it seems like the linearity of the memset on a
+full array is somehow detected and it does not "really" trash the
+caches.
+
+Now the most interesting result: Random walk L1 and L2 trashing
+surrounding a getppid() call.
+
+- Markers compiled out (but 

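The slowdown estimates in the documentation above all follow one formula:
marker sites executed per second, times the extra cycles per site, divided by
the clock rate. A minimal sketch of that arithmetic, using the figures quoted
in the text (the name slowdown_fraction() is illustrative only and is not part
of the patch):

```c
/*
 * Fraction of CPU time spent on marker overhead.
 *   sites_per_sec:   marker sites executed per second
 *   cycles_per_site: extra cycles per site compared to the base case
 *   cpu_hz:          clock rate in cycles per second
 */
static double slowdown_fraction(double sites_per_sec, double cycles_per_site,
				double cpu_hz)
{
	return sites_per_sec * cycles_per_site / cpu_hz;
}
```

With 15348 sites/s on a 3GHz CPU, 559 cycles per site gives about 0.0029
(0.29%), and 7.84 cycles per site gives about 0.00004 (0.004%).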
[patch 05/24] Text Edit Lock - Architecture Independent Code

2007-12-20 Thread Mathieu Desnoyers
This is an architecture independent synchronization around kernel text
modifications through use of a global mutex.

A mutex has been chosen so that kprobes, the main user of this, can sleep during
memory allocation between the memory read of the instructions it must replace
and the memory write of the breakpoint.

Other user of this interface: immediate values.

Paravirt and alternatives are always done when SMP is inactive, so there is no
need to use locks.
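
A minimal userspace sketch of the calling pattern this lock is meant to allow:
read the instruction bytes to replace, allocate memory (which may sleep in the
kernel), then write the new bytes, all under one lock. Everything here is a
stand-in and an assumption for illustration: the text_lock_held flag replaces
the real mutex, fake_text replaces kernel text, and patch_byte() is a
hypothetical helper, not a kernel function.

```c
#include <stdlib.h>

static unsigned char fake_text[4] = { 0x55, 0x89, 0xe5, 0xc3 };
static int text_lock_held;	/* stand-in for the text mutex */

/* Userspace stand-ins for the functions added by this patch. */
static void kernel_text_lock(void)   { text_lock_held = 1; }
static void kernel_text_unlock(void) { text_lock_held = 0; }

/* Replace one byte; return the original byte, or -1 on failure. */
static int patch_byte(size_t off, unsigned char new_byte)
{
	unsigned char *saved;
	int old;

	if (off >= sizeof(fake_text))
		return -1;

	kernel_text_lock();
	old = fake_text[off];		/* read the instruction to replace */
	saved = malloc(1);		/* may sleep: fine under a mutex */
	if (!saved) {
		kernel_text_unlock();
		return -1;
	}
	*saved = (unsigned char)old;	/* keep a copy of the original */
	fake_text[off] = new_byte;	/* write the breakpoint */
	free(saved);
	kernel_text_unlock();
	return old;
}
```

A spinlock would not permit the allocation in the middle of this sequence,
which is the reason given above for choosing a mutex.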

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
---
 include/linux/memory.h |7 +++
 mm/memory.c|   34 ++
 2 files changed, 41 insertions(+)

Index: linux-2.6-lttng/include/linux/memory.h
===
--- linux-2.6-lttng.orig/include/linux/memory.h 2007-11-07 11:11:26.0 -0500
+++ linux-2.6-lttng/include/linux/memory.h  2007-11-07 11:13:48.0 -0500
@@ -93,4 +93,11 @@ extern int memory_notify(unsigned long v
 #define hotplug_memory_notifier(fn, pri) do { } while (0)
 #endif
 
+/*
+ * Take and release the kernel text modification lock, used for code patching.
+ * Users of this lock can sleep.
+ */
+extern void kernel_text_lock(void);
+extern void kernel_text_unlock(void);
+
 #endif /* _LINUX_MEMORY_H_ */
Index: linux-2.6-lttng/mm/memory.c
===
--- linux-2.6-lttng.orig/mm/memory.c2007-11-07 11:12:33.0 -0500
+++ linux-2.6-lttng/mm/memory.c 2007-11-07 11:14:25.0 -0500
@@ -50,6 +50,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -84,6 +86,12 @@ EXPORT_SYMBOL(high_memory);
 
 int randomize_va_space __read_mostly = 1;
 
+/*
+ * mutex protecting text section modification (dynamic code patching).
+ * some users need to sleep (allocating memory...) while they hold this lock.
+ */
+static DEFINE_MUTEX(text_mutex);
+
 static int __init disable_randmaps(char *s)
 {
randomize_va_space = 0;
@@ -2748,3 +2756,29 @@ int access_process_vm(struct task_struct
 
return buf - old_buf;
 }
+
+/**
+ * kernel_text_lock -   Take the kernel text modification lock
+ *
+ * Ensures mutual write exclusion of live kernel and module text
+ * modification. Should be used for code patching.
+ * Users of this lock can sleep.
+ */
+void __kprobes kernel_text_lock(void)
+{
+   mutex_lock(&text_mutex);
+}
+EXPORT_SYMBOL_GPL(kernel_text_lock);
+
+/**
+ * kernel_text_unlock   -   Release the kernel text modification lock
+ *
+ * Ensures mutual write exclusion of live kernel and module text
+ * modification. Should be used for code patching.
+ * Users of this lock can sleep.
+ */
+void __kprobes kernel_text_unlock(void)
+{
+   mutex_unlock(&text_mutex);
+}
+EXPORT_SYMBOL_GPL(kernel_text_unlock);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 03/24] Kprobes - declare kprobe_mutex static

2007-12-20 Thread Mathieu Desnoyers
Since it will not be used by other kernel objects, it makes sense to declare it
static.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
---
 kernel/kprobes.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/kernel/kprobes.c   2007-08-19 09:09:15.0 -0400
+++ linux-2.6-lttng/kernel/kprobes.c    2007-08-19 17:18:07.0 -0400
@@ -68,7 +68,7 @@ static struct hlist_head kretprobe_inst_
 /* NOTE: change this value only with kprobe_mutex held */
 static bool kprobe_enabled;
 
-DEFINE_MUTEX(kprobe_mutex);/* Protects kprobe_table */
+static DEFINE_MUTEX(kprobe_mutex); /* Protects kprobe_table */
 DEFINE_SPINLOCK(kretprobe_lock);   /* Protects kretprobe_inst_table */
 static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 06/24] Text Edit Lock - Alternative code for x86

2007-12-20 Thread Mathieu Desnoyers
Fix a memcpy that should be a text_poke (in apply_alternatives).

Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
correctly and so the CPU HOTPLUG special case can be removed.

Add text_poke_early, for alternatives and paravirt boot-time and module load
time patching.

Notes:
- A macro is used instead of an inline function to deal with circular header
  include otherwise necessary for read_cr0 and preempt_disable/enable.

Changelog:

- Fix text_set and text_poke alignment check (mixed up bitwise and and or)
- Remove text_set
- Use the new macro INIT_ARRAY() to stop polluting the C files with ({ })
  brackets (which breaks some c parsers in editors).
- Export add_nops, so it can be used by others.
- Remove x86 test for "wp_works_ok", it will just be ignored by the architecture
  if not supported.
- Document text_poke_early.
- Remove clflush, since it breaks some VIA architectures and is not strictly
  necessary.
- Add kerneldoc to text_poke and text_poke_early.
- Remove arg cr0 from kernel_wp_save/restore. Rename the macros to
  kernel_wp_disable/enable.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/alternative.c|   56 ---
 include/asm-x86/alternative_32.h |   36 -
 include/asm-x86/alternative_64.h |   36 -
 3 files changed, 116 insertions(+), 12 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c  2007-12-06 10:08:58.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c   2007-12-06 10:08:58.0 -0500
@@ -173,7 +173,7 @@ static const unsigned char*const * find_
 #endif /* CONFIG_X86_64 */
 
 /* Use this to add nops to a buffer, then text_poke the whole buffer. */
-static void add_nops(void *insns, unsigned int len)
+void add_nops(void *insns, unsigned int len)
 {
const unsigned char *const *noptable = find_nop_table();
 
@@ -186,6 +186,7 @@ static void add_nops(void *insns, unsign
len -= noplen;
}
 }
+EXPORT_SYMBOL_GPL(add_nops);
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 extern u8 *__smp_locks[], *__smp_locks_end[];
@@ -219,7 +220,7 @@ void apply_alternatives(struct alt_instr
memcpy(insnbuf, a->replacement, a->replacementlen);
add_nops(insnbuf + a->replacementlen,
 a->instrlen - a->replacementlen);
-   text_poke(instr, insnbuf, a->instrlen);
+   text_poke_early(instr, insnbuf, a->instrlen);
}
 }
 
@@ -234,7 +235,8 @@ static void alternatives_smp_lock(u8 **s
continue;
if (*ptr > text_end)
continue;
-   text_poke(*ptr, ((unsigned char []){0xf0}), 1); /* add lock prefix */
+   /* add lock prefix */
+   text_poke(*ptr, INIT_ARRAY(unsigned char, 0xf0, 1), 1);
};
 }
 
@@ -397,7 +399,7 @@ void apply_paravirt(struct paravirt_patc
 
/* Pad the rest with nops */
add_nops(insnbuf + used, p->len - used);
-   text_poke(p->instr, insnbuf, p->len);
+   text_poke_early(p->instr, insnbuf, p->len);
}
 }
 extern struct paravirt_patch_site __start_parainstructions[],
@@ -457,18 +459,52 @@ void __init alternative_instructions(voi
 #endif
 }
 
-/*
- * Warning:
+/**
+ * text_poke_early - Update instructions on a live kernel at boot time
+ * @addr: address to modify
+ * @opcode: source of the copy
+ * @len: length to copy
+ *
  * When you use this code to patch more than one byte of an instruction
  * you need to make sure that other CPUs cannot execute this code in parallel.
- * Also no thread must be currently preempted in the middle of these instructions.
- * And on the local CPU you need to be protected again NMI or MCE handlers
- * seeing an inconsistent instruction while you patch.
+ * Also no thread must be currently preempted in the middle of these
+ * instructions.  And on the local CPU you need to be protected against NMI or MCE
+ * handlers seeing an inconsistent instruction while you patch.
+ * Warning: read_cr0 is modified by paravirt, this is why we have _early
+ * versions. They are not in the __init section because they can be used at
+ * module load time.
  */
-void __kprobes text_poke(void *addr, unsigned char *opcode, int len)
+void *text_poke_early(void *addr, const void *opcode, size_t len)
 {
memcpy(addr, opcode, len);
sync_core();
/* Could also do a CLFLUSH here to speed up CPU recovery; but
   that causes hangs on some VIA CPUs. */
+   return addr;
 }
+
+/**
+ * text_poke - 

[patch 00/24] Markers use immediate values, for 2.6.24-rc5-mm1

2007-12-20 Thread Mathieu Desnoyers
Hi Andrew,

Here are the patches that would be interesting to queue for 2.6.25. As you
asked, the patchset applies to 2.6.24-rc5-mm1.

It includes the following logical changes, which apply in this order.

Thanks,

Mathieu

#Text Edit Lock
kprobes-use-mutex-for-insn-pages.patch
kprobes-dont-use-kprobes-mutex-in-arch-code.patch
kprobes-declare-kprobes-mutex-static.patch
declare-array.patch
text-edit-lock-architecture-independent-code.patch
text-edit-lock-alternative-i386-and-x86_64.patch
text-edit-lock-kprobes-architecture-independent.patch
text-edit-lock-kprobes-i386.patch
text-edit-lock-kprobes-x86_64.patch
text-edit-lock-i386-standardize-debug-rodata.patch
text-edit-lock-x86_64-standardize-debug-rodata.patch
#
#Immediate Values
immediate-values-architecture-independent-code.patch
immediate-values-kconfig-embedded.patch
immediate-values-x86-optimization.patch
add-text-poke-to-powerpc.patch
immediate-values-powerpc-optimization.patch
immediate-values-documentation.patch
#
profiling-use-immediate-values.patch
#
#Markers use immediate values
immediate-values-move-kprobes-x86-restore-interrupt-to-kdebug-h.patch
add-discard-section-to-x86.patch
immediate-values-x86-optimization-nmi-mce-support.patch
immediate-values-powerpc-optimization-nmi-mce-support.patch
immediate-values-use-arch-nmi-mce-support.patch
linux-kernel-markers-immediate-values.patch

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 09/24] Text Edit Lock - kprobes x86_64

2007-12-20 Thread Mathieu Desnoyers
Make kprobes use INIT_ARRAY().

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Tested-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/kprobes_64.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/kprobes_64.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_64.c   2007-11-13 09:45:35.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_64.c    2007-11-13 09:45:46.0 -0500
@@ -215,12 +215,13 @@ static void __kprobes arch_copy_kprobe(s
 
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
-   text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
+   text_poke(p->addr, INIT_ARRAY(unsigned char, BREAKPOINT_INSTRUCTION, 1),
+   1);
 }
 
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
-   text_poke(p->addr, &p->opcode, 1);
+   text_poke(p->addr, INIT_ARRAY(unsigned char, p->opcode, 1), 1);
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 21/24] Immediate Values - x86 Optimization NMI and MCE support

2007-12-20 Thread Mathieu Desnoyers
x86 optimization of the immediate values which uses a movl with code patching
to set/unset the value used to populate the register used as variable source.
It uses a breakpoint to bypass the instruction being changed, which lessens the
interrupt latency of the operation and protects against NMIs and MCE.

Changelog:
- Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing
  non atomic writes to a code region only touched by us (nobody can execute it
  since we are protected by the imv_mutex).
- Add x86_64 support, ready for i386+x86_64 -> x86 merge.
- Use asm-x86/asm.h.
- Change the immediate.c update code to support variable length opcodes.
- Use imv_* instead of immediate_*.
- Use kernel_wp_disable/enable instead of save/restore.
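
The patch below pads the instruction stream with the assembler expression
((-.-(2b-1b)) & (%c2-1)) so that the immediate operand can be updated
atomically. A minimal sketch of that computation in C, assuming the immediate
occupies the last imm_size bytes of the instruction and that imm_size is a
power of two (the name imv_pad() is illustrative only, not part of the patch):

```c
#include <stdint.h>

/*
 * Number of padding bytes to emit at address addr so that an instruction
 * of insn_size bytes ends on a multiple of imm_size. When the immediate is
 * the trailing imm_size bytes, it then starts on an aligned address too.
 */
static uintptr_t imv_pad(uintptr_t addr, uintptr_t insn_size,
			 uintptr_t imm_size)
{
	return (0 - addr - insn_size) & (imm_size - 1);
}
```

For example, a 5 byte instruction with a 4 byte immediate at address 0x1001
needs 2 bytes of padding: the instruction then spans 0x1003 to 0x1007 and the
immediate starts at 0x1004.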

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: "H. Peter Anvin" <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/x86/kernel/Makefile_32 |1 
 arch/x86/kernel/Makefile_64 |1 
 arch/x86/kernel/immediate.c |  277 
 arch/x86/kernel/traps_32.c  |   10 -
 include/asm-x86/immediate.h |   42 +-
 5 files changed, 322 insertions(+), 9 deletions(-)

Index: linux-2.6-lttng/include/asm-x86/immediate.h
===
--- linux-2.6-lttng.orig/include/asm-x86/immediate.h    2007-12-06 09:41:58.0 -0500
+++ linux-2.6-lttng/include/asm-x86/immediate.h 2007-12-06 09:42:29.0 -0500
@@ -12,6 +12,18 @@
 
 #include 
 
+struct __imv {
+   unsigned long var;  /* Pointer to the identifier variable of the
+* immediate value
+*/
+   unsigned long imv;  /*
+* Pointer to the memory location of the
+* immediate value within the instruction.
+*/
+   unsigned char size; /* Type size. */
+   unsigned char insn_size;/* Instruction size. */
+} __attribute__ ((packed));
+
 /**
  * imv_read - read immediate variable
  * @name: immediate value name
@@ -26,6 +38,11 @@
  * what will generate an instruction with 8 bytes immediate value (not the REX.W
  * prefixed one that loads a sign extended 32 bits immediate value in a r64
  * register).
+ *
+ * Create the instruction in a discarded section to calculate its size. This is
+ * how we can align the beginning of the instruction on an address that will
+ * permit atomic modification of the immediate value without knowing the size of
+ * the opcode used by the compiler. The operand size is known in advance.
  */
 #define imv_read(name) \
({  \
@@ -35,8 +52,9 @@
case 1: \
asm(".section __imv,\"a\",@progbits\n\t"\
_ASM_PTR "%c1, (3f)-%c2\n\t"\
-   ".byte %c2\n\t" \
+   ".byte %c2, (3f-2f)\n\t"\
".previous\n\t" \
+   "2:\n\t"\
"mov $0,%0\n\t" \
"3:\n\t"\
: "=q" (value)  \
@@ -45,10 +63,16 @@
break;  \
case 2: \
case 4: \
-   asm(".section __imv,\"a\",@progbits\n\t"\
+   asm(".section __discard,\"\",@progbits\n\t" \
+   "1:\n\t"\
+   "mov $0,%0\n\t" \
+   "2:\n\t"\
+   ".previous\n\t" \
+   ".section __imv,\"a\",@progbits\n\t"\
_ASM_PTR "%c1, (3f)-%c2\n\t"\
-   ".byte %c2\n\t" \
+   ".byte %c2, (2b-1b)\n\t"\
".previous\n\t" \
+   ".org . + ((-.-(2b-1b)) & (%c2-1)), 0x90\n\t" \
"mov $0,%0\n\t" \

[patch 20/24] Add __discard section to x86

2007-12-20 Thread Mathieu Desnoyers
Add a __discard section to the linker script. Code produced in this section will
not be put in the vmlinux file. This is useful when we have to calculate the
size of an instruction before actually declaring it (for alignment purposes for
instance). This is used by the immediate values.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: H. Peter Anvin <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/x86/kernel/vmlinux_32.lds.S |1 +
 arch/x86/kernel/vmlinux_64.lds.S |1 +
 2 files changed, 2 insertions(+)

Index: linux-2.6-lttng/arch/x86/kernel/vmlinux_32.lds.S
===
--- linux-2.6-lttng.orig/arch/x86/kernel/vmlinux_32.lds.S   2007-11-14 14:10:43.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/vmlinux_32.lds.S    2007-11-14 14:11:32.0 -0500
@@ -205,6 +205,7 @@ SECTIONS
   /* Sections to be discarded */
   /DISCARD/ : {
*(.exitcall.exit)
+   *(__discard)
}
 
   STABS_DEBUG
Index: linux-2.6-lttng/arch/x86/kernel/vmlinux_64.lds.S
===
--- linux-2.6-lttng.orig/arch/x86/kernel/vmlinux_64.lds.S   2007-11-14 14:10:46.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/vmlinux_64.lds.S    2007-11-14 14:11:48.0 -0500
@@ -227,6 +227,7 @@ SECTIONS
   /DISCARD/ : {
*(.exitcall.exit)
*(.eh_frame)
+   *(__discard)
}
 
   STABS_DEBUG

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 13/24] Immediate Values - Kconfig menu in EMBEDDED

2007-12-20 Thread Mathieu Desnoyers
Immediate values provide a way to use dynamic code patching to update variables
sitting within the instruction stream. It saves cache lines normally used by
static read mostly variables. Enable it by default, but let users disable it
through the EMBEDDED menu with the "Disable immediate values" submenu entry.

Note: Since I think that I really should give embedded systems developers using
RO memory the option to disable the immediate values, I chose to leave this
menu option there, in the EMBEDDED menu. Also, the "CONFIG_IMMEDIATE" makes
sense because we want to compile out all the immediate code when we decide not
to use optimized immediate values at all (it removes otherwise unused code).

Changelog:
- Change ARCH_SUPPORTS_IMMEDIATE for ARCH_HAS_IMMEDIATE

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
CC: Adrian Bunk <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: Alexey Dobriyan <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
---
 init/Kconfig |   24 
 1 file changed, 24 insertions(+)

Index: linux-2.6-lttng/init/Kconfig
===
--- linux-2.6-lttng.orig/init/Kconfig   2007-12-05 20:53:19.0 -0500
+++ linux-2.6-lttng/init/Kconfig2007-12-05 20:53:35.0 -0500
@@ -435,6 +435,20 @@ config CC_OPTIMIZE_FOR_SIZE
 config SYSCTL
bool
 
+config IMMEDIATE
+   default y if !DISABLE_IMMEDIATE
+   depends on HAVE_IMMEDIATE
+   bool
+   help
+ Immediate values are used as read-mostly variables that are rarely
+ updated. They use code patching to modify the values inscribed in the
+ instruction stream. It provides a way to save precious cache lines
+ that would otherwise have to be used by these variables. They can be
+ disabled through the EMBEDDED menu.
+
+config HAVE_IMMEDIATE
+   def_bool n
+
 menuconfig EMBEDDED
bool "Configure standard kernel features (for small systems)"
help
@@ -670,6 +684,16 @@ config MARKERS
 
 source "arch/Kconfig"
 
+config DISABLE_IMMEDIATE
+   default y if EMBEDDED
+   bool "Disable immediate values" if EMBEDDED
+   depends on HAVE_IMMEDIATE
+   help
+ Disable code patching based immediate values for embedded systems.
+ Immediate values consume slightly more memory and require modifying the
+ instruction stream each time a variable is updated. They should really
+ be disabled for embedded systems with read-only text.
+
 endmenu# General setup
 
 config RT_MUTEXES

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 11/24] Text Edit Lock - x86_64 standardize debug rodata

2007-12-20 Thread Mathieu Desnoyers
Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes.
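
Note that the diff below also changes how the end address is rounded: the old
code rounded it down with "end &= PAGE_MASK", while PFN_ALIGN() rounds up. A
minimal sketch of that rounding, assuming 4 KiB pages (the sketch_ and
lowercase names are illustrative stand-ins, not the kernel macros themselves):

```c
/* Assumed page geometry for the sketch: 4 KiB pages. */
static const unsigned long sketch_page_shift = 12;
static const unsigned long sketch_page_size = 1UL << 12;

/* Round an address up to the next page boundary, as PFN_ALIGN() does. */
static unsigned long pfn_align(unsigned long addr)
{
	return (addr + sketch_page_size - 1) & ~(sketch_page_size - 1);
}

/* Number of pages covered once both ends are rounded up. */
static unsigned long ro_pages(unsigned long start, unsigned long end)
{
	return (pfn_align(end) - pfn_align(start)) >> sketch_page_shift;
}
```

With both ends rounded up, a region ending one byte into a page now includes
that whole page in the range passed to change_page_attr_addr().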

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86_64/mm/init.c |   23 +--
 1 file changed, 5 insertions(+), 18 deletions(-)

Index: linux-2.6-lttng/arch/x86/mm/init_64.c
===
--- linux-2.6-lttng.orig/arch/x86/mm/init_64.c  2007-09-24 11:00:01.0 -0400
+++ linux-2.6-lttng/arch/x86/mm/init_64.c   2007-09-24 11:00:02.0 -0400
@@ -592,25 +592,11 @@ void free_initmem(void)
 
 void mark_rodata_ro(void)
 {
-   unsigned long start = (unsigned long)_stext, end;
+   unsigned long start = PFN_ALIGN(_stext);
+   unsigned long end = PFN_ALIGN(__end_rodata);
 
-#ifdef CONFIG_HOTPLUG_CPU
-   /* It must still be possible to apply SMP alternatives. */
-   if (num_possible_cpus() > 1)
-   start = (unsigned long)_etext;
-#endif
-
-#ifdef CONFIG_KPROBES
-   start = (unsigned long)__start_rodata;
-#endif
-   
-   end = (unsigned long)__end_rodata;
-   start = (start + PAGE_SIZE - 1) & PAGE_MASK;
-   end &= PAGE_MASK;
-   if (end <= start)
-   return;
-
-   change_page_attr_addr(start, (end - start) >> PAGE_SHIFT, PAGE_KERNEL_RO);
+   change_page_attr_addr(start, (end - start) >> PAGE_SHIFT,
+   PAGE_KERNEL_RO);
 
printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
   (end - start) >> 10);
@@ -623,6 +609,7 @@ void mark_rodata_ro(void)
 */
global_flush_tlb();
 }
+
 #endif
 
 #ifdef CONFIG_BLK_DEV_INITRD

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 01/24] Kprobes - use a mutex to protect the instruction pages list.

2007-12-20 Thread Mathieu Desnoyers
Protect the instruction pages list with a dedicated insn pages mutex, taken in
get_insn_slot() and free_insn_slot(). This makes sure that architectures that do
not need to call arch_remove_kprobe() do not take an unneeded kprobes mutex.
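
The diff below converts every early return in get_insn_slot() into a jump to a
single exit label, so that the new mutex is released on all paths. A minimal
userspace sketch of that pattern follows. All names are hypothetical stand-ins:
insn_lock_depth replaces kprobe_insn_mutex, and the fixed slot array replaces
the instruction pages list.

```c
#define NSLOTS 4

static int slot_used[NSLOTS];
static int insn_lock_depth;	/* stand-in for the insn pages mutex */

static void insn_lock(void)   { insn_lock_depth++; }
static void insn_unlock(void) { insn_lock_depth--; }

/* Return a free slot index, or -1 when all slots are taken. */
static int get_slot(void)
{
	int ret = -1;
	int i;

	insn_lock();
	for (i = 0; i < NSLOTS; i++) {
		if (!slot_used[i]) {
			slot_used[i] = 1;
			ret = i;
			goto end;	/* was a bare return before locking */
		}
	}
	/* No free slot: fall through with ret == -1. */
end:
	insn_unlock();
	return ret;
}

/* Mark a slot free again, under the same lock. */
static void free_slot(int i)
{
	insn_lock();
	if (i >= 0 && i < NSLOTS)
		slot_used[i] = 0;
	insn_unlock();
}
```

Checking that insn_lock_depth is back to zero after each call is a cheap way
to confirm that no path leaves the lock held.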

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
---
 kernel/kprobes.c |   27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===
--- linux-2.6-lttng.orig/kernel/kprobes.c   2007-08-27 11:48:56.0 -0400
+++ linux-2.6-lttng/kernel/kprobes.c    2007-08-27 11:48:58.0 -0400
@@ -95,6 +95,10 @@ enum kprobe_slot_state {
SLOT_USED = 2,
 };
 
+/*
+ * Protects the kprobe_insn_pages list. Can nest into kprobe_mutex.
+ */
+static DEFINE_MUTEX(kprobe_insn_mutex);
 static struct hlist_head kprobe_insn_pages;
 static int kprobe_garbage_slots;
 static int collect_garbage_slots(void);
@@ -131,7 +135,9 @@ kprobe_opcode_t __kprobes *get_insn_slot
 {
struct kprobe_insn_page *kip;
struct hlist_node *pos;
+   kprobe_opcode_t *ret;
 
+   mutex_lock(&kprobe_insn_mutex);
  retry:
hlist_for_each_entry(kip, pos, &kprobe_insn_pages, hlist) {
if (kip->nused < INSNS_PER_PAGE) {
@@ -140,7 +146,8 @@ kprobe_opcode_t __kprobes *get_insn_slot
if (kip->slot_used[i] == SLOT_CLEAN) {
kip->slot_used[i] = SLOT_USED;
kip->nused++;
-   return kip->insns + (i * MAX_INSN_SIZE);
+   ret = kip->insns + (i * MAX_INSN_SIZE);
+   goto end;
}
}
/* Surprise!  No unused slots.  Fix kip->nused. */
@@ -154,8 +161,10 @@ kprobe_opcode_t __kprobes *get_insn_slot
}
/* All out of space.  Need to allocate a new page. Use slot 0. */
kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL);
-   if (!kip)
-   return NULL;
+   if (!kip) {
+   ret = NULL;
+   goto end;
+   }
 
/*
 * Use module_alloc so this page is within +/- 2GB of where the
@@ -165,7 +174,8 @@ kprobe_opcode_t __kprobes *get_insn_slot
kip->insns = module_alloc(PAGE_SIZE);
if (!kip->insns) {
kfree(kip);
-   return NULL;
+   ret = NULL;
+   goto end;
}
INIT_HLIST_NODE(&kip->hlist);
hlist_add_head(&kip->hlist, &kprobe_insn_pages);
@@ -173,7 +183,10 @@ kprobe_opcode_t __kprobes *get_insn_slot
kip->slot_used[0] = SLOT_USED;
kip->nused = 1;
kip->ngarbage = 0;
-   return kip->insns;
+   ret = kip->insns;
+end:
+   mutex_unlock(&kprobe_insn_mutex);
+   return ret;
 }
 
 /* Return 1 if all garbages are collected, otherwise 0. */
@@ -207,7 +220,7 @@ static int __kprobes collect_garbage_slo
struct kprobe_insn_page *kip;
struct hlist_node *pos, *next;
 
-   /* Ensure no-one is preepmted on the garbages */
+   /* Ensure no-one is preempted on the garbages */
if (check_safety() != 0)
return -EAGAIN;
 
@@ -231,6 +244,7 @@ void __kprobes free_insn_slot(kprobe_opc
struct kprobe_insn_page *kip;
struct hlist_node *pos;
 
+   mutex_lock(&kprobe_insn_mutex);
hlist_for_each_entry(kip, pos, &kprobe_insn_pages, hlist) {
if (kip->insns <= slot &&
slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) {
@@ -247,6 +261,7 @@ void __kprobes free_insn_slot(kprobe_opc
 
if (dirty && ++kprobe_garbage_slots > INSNS_PER_PAGE)
collect_garbage_slots();
+   mutex_unlock(&kprobe_insn_mutex);
 }
 #endif
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 19/24] Immediate Values - Move Kprobes x86 restore_interrupt to kdebug.h

2007-12-20 Thread Mathieu Desnoyers
Since the breakpoint handler is useful both to kprobes and immediate values, it
makes sense to make the required restore_interrupts() available through
asm-x86/kdebug.h.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 include/asm-x86/kdebug.h |   12 
 include/asm-x86/kprobes_32.h |9 -
 include/asm-x86/kprobes_64.h |9 -
 3 files changed, 12 insertions(+), 18 deletions(-)

Index: linux-2.6-lttng/include/asm-x86/kdebug.h
===
--- linux-2.6-lttng.orig/include/asm-x86/kdebug.h   2007-11-02 15:01:53.0 -0400
+++ linux-2.6-lttng/include/asm-x86/kdebug.h    2007-11-02 15:02:00.0 -0400
@@ -3,6 +3,9 @@
 
 #include 
 
+#include 
+#include 
+
 struct pt_regs;
 
 /* Grossly misnamed. */
@@ -30,4 +33,13 @@ extern void dump_pagetable(unsigned long
 extern unsigned long oops_begin(void);
 extern void oops_end(unsigned long);
 
+/* trap3/1 are intr gates for kprobes.  So, restore the status of IF,
+ * if necessary, before executing the original int3/1 (trap) handler.
+ */
+static inline void restore_interrupts(struct pt_regs *regs)
+{
+   if (regs->eflags & IF_MASK)
+   local_irq_enable();
+}
+
 #endif
Index: linux-2.6-lttng/include/asm-x86/kprobes_32.h
===
--- linux-2.6-lttng.orig/include/asm-x86/kprobes_32.h   2007-11-02 15:01:53.0 -0400
+++ linux-2.6-lttng/include/asm-x86/kprobes_32.h    2007-11-02 15:02:00.0 -0400
@@ -79,15 +79,6 @@ struct kprobe_ctlblk {
struct prev_kprobe prev_kprobe;
 };
 
-/* trap3/1 are intr gates for kprobes.  So, restore the status of IF,
- * if necessary, before executing the original int3/1 (trap) handler.
- */
-static inline void restore_interrupts(struct pt_regs *regs)
-{
-   if (regs->eflags & IF_MASK)
-   local_irq_enable();
-}
-
 extern int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
Index: linux-2.6-lttng/include/asm-x86/kprobes_64.h
===
--- linux-2.6-lttng.orig/include/asm-x86/kprobes_64.h   2007-11-02 15:02:10.0 -0400
+++ linux-2.6-lttng/include/asm-x86/kprobes_64.h    2007-11-02 15:02:22.0 -0400
@@ -72,15 +72,6 @@ struct kprobe_ctlblk {
struct prev_kprobe prev_kprobe;
 };
 
-/* trap3/1 are intr gates for kprobes.  So, restore the status of IF,
- * if necessary, before executing the original int3/1 (trap) handler.
- */
-static inline void restore_interrupts(struct pt_regs *regs)
-{
-   if (regs->eflags & IF_MASK)
-   local_irq_enable();
-}
-
 extern int post_kprobe_handler(struct pt_regs *regs);
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_handler(struct pt_regs *regs);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 22/24] Immediate Values - Powerpc Optimization NMI MCE support

2007-12-20 Thread Mathieu Desnoyers
Use an atomic update for immediate values.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Paul Mackerras <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/Makefile|1 
 arch/powerpc/kernel/immediate.c |   73 
 include/asm-powerpc/immediate.h |   18 +
 3 files changed, 92 insertions(+)

Index: linux-2.6-lttng/arch/powerpc/kernel/immediate.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/arch/powerpc/kernel/immediate.c 2007-12-20 20:52:27.0 -0500
@@ -0,0 +1,73 @@
+/*
+ * Powerpc optimized immediate values enabling/disabling.
+ *
+ * Mathieu Desnoyers <[EMAIL PROTECTED]>
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define LI_OPCODE_LEN  2
+
+/**
+ * arch_imv_update - update one immediate value
+ * @imv: pointer of type const struct __imv to update
+ * @early: early boot (1), normal (0)
+ *
+ * Update one immediate value. Must be called with imv_mutex held.
+ */
+int arch_imv_update(const struct __imv *imv, int early)
+{
+#ifdef CONFIG_KPROBES
+	kprobe_opcode_t *insn;
+	/*
+	 * Fail if a kprobe has been set on this instruction.
+	 * (TODO: we could eventually do better and modify all the (possibly
+	 * nested) kprobes for this site if kprobes had an API for this.)
+	 */
+	switch (imv->size) {
+	case 1:	/* The uint8_t points to the 3rd byte of the
+		 * instruction */
+		insn = (void *)(imv->imv - 1 - LI_OPCODE_LEN);
+		break;
+	case 2:	insn = (void *)(imv->imv - LI_OPCODE_LEN);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (unlikely(!early && *insn == BREAKPOINT_INSTRUCTION)) {
+		printk(KERN_WARNING "Immediate value in conflict with kprobe. "
+				    "Variable at %p, "
+				    "instruction at %p, size %lu\n",
+				    (void *)imv->var,
+				    (void *)imv->imv, (unsigned long)imv->size);
+		return -EBUSY;
+	}
+#endif
+
+	/*
+	 * If the variable and the instruction have the same value, there is
+	 * nothing to do.
+	 */
+	switch (imv->size) {
+	case 1:	if (*(uint8_t *)imv->imv
+				== *(uint8_t *)imv->var)
+			return 0;
+		break;
+	case 2:	if (*(uint16_t *)imv->imv
+				== *(uint16_t *)imv->var)
+			return 0;
+		break;
+	default:return -EINVAL;
+	}
+	memcpy((void *)imv->imv, (void *)imv->var,
+			imv->size);
+	flush_icache_range(imv->imv,
+			imv->imv + imv->size);
+	return 0;
+}
Index: linux-2.6-lttng/include/asm-powerpc/immediate.h
===
--- linux-2.6-lttng.orig/include/asm-powerpc/immediate.h 2007-12-20 20:52:20.0 -0500
+++ linux-2.6-lttng/include/asm-powerpc/immediate.h 2007-12-20 20:52:27.0 -0500
@@ -12,6 +12,16 @@
 
 #include 
 
+struct __imv {
+	unsigned long var;	/* Identifier variable of the immediate value */
+	unsigned long imv;	/*
+				 * Pointer to the memory location that holds
+				 * the immediate value within the load immediate
+				 * instruction.
+				 */
+	unsigned char size;	/* Type size. */
+} __attribute__ ((packed));
+
 /**
  * imv_read - read immediate variable
  * @name: immediate value name
@@ -19,6 +29,11 @@
  * Reads the value of @name.
  * Optimized version of the immediate.
  * Do not use in __init and __exit functions. Use _imv_read() instead.
+ * Makes sure the 2 bytes update will be atomic by aligning the immediate
+ * value. Use a normal memory read for the 4 bytes immediate because there is no
+ * way to atomically update it without using a seqlock read side, which would
+ * cost more in term of total i-cache and d-cache space than a simple memory
+ * read.
  */
 #define imv_read(name) \
({  \
@@ -40,6 +55,7 @@
PPC_LONG "%c1, ((1f)-2)\n\t"\
".byte 2\n\t"   \
".previous\n\t" \
+   ".align 2\n\t"  \
"li %0,0\n\t"   \
"1:\n\t"\
: 

[patch 15/24] Add text_poke and sync_core to powerpc

2007-12-20 Thread Mathieu Desnoyers
- Needed on architectures where we must surround live instruction modification
  with "WP flag disable".
- Turns into a memcpy on powerpc since there is no WP flag activated for
  instruction pages (yet..).
- Add empty sync_core to powerpc so it can be used in architecture independent
  code.
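
The points above can be sketched in userspace C roughly as follows. The two text_poke defines mirror the patch; the caller patch_bytes() is hypothetical, and sync_core() is written here as an empty do/while statement (the patch defines it as an empty macro) purely so that it is safe to use as a statement in this model.

```c
#include <assert.h>
#include <string.h>

/* As in the patch: no write protection to lift, so a poke is a copy. */
#define text_poke	memcpy
#define text_poke_early	text_poke

/* Assumption: empty statement form instead of the patch's empty macro. */
#define sync_core()	do { } while (0)

/* Hypothetical caller: overwrite len bytes at addr, then serialize. */
static void patch_bytes(unsigned char *addr, const unsigned char *src,
			size_t len)
{
	assert(addr && src);
	text_poke(addr, src, len);
	sync_core();
}
```

Architecture-independent code can then call text_poke() and sync_core() unconditionally, and on this model both reduce to a plain copy and a no-op.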

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Paul Mackerras <[EMAIL PROTECTED]>
---
 include/asm-powerpc/cacheflush.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-powerpc/cacheflush.h
===
--- linux-2.6-lttng.orig/include/asm-powerpc/cacheflush.h 2007-11-19 12:05:50.0 -0500
+++ linux-2.6-lttng/include/asm-powerpc/cacheflush.h 2007-11-19 13:27:36.0 -0500
@@ -63,7 +63,9 @@ extern void flush_dcache_phys_range(unsi
 #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
memcpy(dst, src, len)
 
-
+#define text_poke	memcpy
+#define text_poke_early	text_poke
+#define sync_core()
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
 /* internal debugging function */

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 18/24] Scheduler Profiling - Use Immediate Values

2007-12-20 Thread Mathieu Desnoyers
Use an immediate value as the condition for the scheduler profiling call; the
optimized version has a lower d-cache impact than reading a variable.
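
To make the shape of the change concrete, here is a simplified userspace model of the fast-path check. It assumes the imv_* macros degrade to plain variable accesses (the optimized imv_read() would instead read an operand patched into the instruction stream), and the hits counter and record_hit() are hypothetical replacements for prof_buffer and profile_hits().

```c
#include <assert.h>

/* Assumption: generic fallbacks that are plain variable accesses. */
#define DEFINE_IMV(type, name)	type name
#define imv_set(name, value)	((name) = (value))
#define imv_read(name)		(name)
#define _imv_read(name)		(name)

#define CPU_PROFILING	1
#define SCHED_PROFILING	2

DEFINE_IMV(char, prof_on);

static unsigned long hits;	/* hypothetical stand-in for prof_buffer */

static void record_hit(unsigned int nr_hits)
{
	hits += nr_hits;
}

/* Fast path: a single comparison when this profiling type is off. */
static inline void profile_hit(int type)
{
	assert(type != 0);	/* 0 means "profiling disabled" */
	if (imv_read(prof_on) == type)
		record_hit(1);
}
```

The type comparison lives only in the inline caller, which matches the patch dropping the prof_on != type test from profile_hits().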

Changelog :
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 drivers/kvm/kvm_main.c  |3 ++-
 include/linux/profile.h |5 +++--
 kernel/profile.c|   22 +++---
 kernel/sched_fair.c |6 +-
 4 files changed, 17 insertions(+), 19 deletions(-)

Index: linux-2.6-lttng/kernel/profile.c
===
--- linux-2.6-lttng.orig/kernel/profile.c 2007-12-05 20:50:34.0 -0500
+++ linux-2.6-lttng/kernel/profile.c 2007-12-05 20:53:43.0 -0500
@@ -42,8 +42,8 @@ static int (*timer_hook)(struct pt_regs 
 static atomic_t *prof_buffer;
 static unsigned long prof_len, prof_shift;
 
-int prof_on __read_mostly;
-EXPORT_SYMBOL_GPL(prof_on);
+DEFINE_IMV(char, prof_on) __read_mostly;
+EXPORT_IMV_SYMBOL_GPL(prof_on);
 
 static cpumask_t prof_cpu_mask = CPU_MASK_ALL;
 #ifdef CONFIG_SMP
@@ -61,7 +61,7 @@ static int __init profile_setup(char * s
 
if (!strncmp(str, sleepstr, strlen(sleepstr))) {
 #ifdef CONFIG_SCHEDSTATS
-   prof_on = SLEEP_PROFILING;
+   imv_set(prof_on, SLEEP_PROFILING);
if (str[strlen(sleepstr)] == ',')
str += strlen(sleepstr) + 1;
if (get_option(&str, &par))
@@ -74,7 +74,7 @@ static int __init profile_setup(char * s
"kernel sleep profiling requires CONFIG_SCHEDSTATS\n");
 #endif /* CONFIG_SCHEDSTATS */
} else if (!strncmp(str, schedstr, strlen(schedstr))) {
-   prof_on = SCHED_PROFILING;
+   imv_set(prof_on, SCHED_PROFILING);
if (str[strlen(schedstr)] == ',')
str += strlen(schedstr) + 1;
if (get_option(&str, &par))
@@ -83,7 +83,7 @@ static int __init profile_setup(char * s
"kernel schedule profiling enabled (shift: %ld)\n",
prof_shift);
} else if (!strncmp(str, kvmstr, strlen(kvmstr))) {
-   prof_on = KVM_PROFILING;
+   imv_set(prof_on, KVM_PROFILING);
if (str[strlen(kvmstr)] == ',')
str += strlen(kvmstr) + 1;
if (get_option(&str, &par))
@@ -93,7 +93,7 @@ static int __init profile_setup(char * s
prof_shift);
} else if (get_option(&str, &par)) {
prof_shift = par;
-   prof_on = CPU_PROFILING;
+   imv_set(prof_on, CPU_PROFILING);
printk(KERN_INFO "kernel profiling enabled (shift: %ld)\n",
prof_shift);
}
@@ -104,7 +104,7 @@ __setup("profile=", profile_setup);
 
 void __init profile_init(void)
 {
-   if (!prof_on) 
+   if (!_imv_read(prof_on))
return;
  
/* only text is profiled */
@@ -293,7 +293,7 @@ void profile_hits(int type, void *__pc, 
int i, j, cpu;
struct profile_hit *hits;
 
-   if (prof_on != type || !prof_buffer)
+   if (!prof_buffer)
return;
pc = min((pc - (unsigned long)_stext) >> prof_shift, prof_len - 1);
i = primary = (pc & (NR_PROFILE_GRP - 1)) << PROFILE_GRPSHIFT;
@@ -403,7 +403,7 @@ void profile_hits(int type, void *__pc, 
 {
unsigned long pc;
 
-   if (prof_on != type || !prof_buffer)
+   if (!prof_buffer)
return;
pc = ((unsigned long)__pc - (unsigned long)_stext) >> prof_shift;
atomic_add(nr_hits, &prof_buffer[min(pc, prof_len - 1)]);
@@ -560,7 +560,7 @@ static int __init create_hash_tables(voi
}
return 0;
 out_cleanup:
-   prof_on = 0;
+   imv_set(prof_on, 0);
smp_mb();
on_each_cpu(profile_nop, NULL, 0, 1);
for_each_online_cpu(cpu) {
@@ -587,7 +587,7 @@ static int __init create_proc_profile(vo
 {
struct proc_dir_entry *entry;
 
-   if (!prof_on)
+   if (!_imv_read(prof_on))
return 0;
if (create_hash_tables())
return -1;
Index: linux-2.6-lttng/include/linux/profile.h
===
--- linux-2.6-lttng.orig/include/linux/profile.h 2007-12-05 20:50:34.0 -0500
+++ linux-2.6-lttng/include/linux/profile.h 2007-12-05 20:53:43.0 -0500
@@ -7,10 +7,11 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
-extern int prof_on __read_mostly;
+DECLARE_IMV(char, prof_on) __read_mostly;
 
 #define CPU_PROFILING  1
 #define SCHED_PROFILING	2
@@ -38,7 +39,7 @@ static inline void profile_hit(int type,
/*
 * Speedup for the common (no profiling enabled) case:
 */
-   if (unlikely(prof_on == type))
+   if (unlikely(imv_read(prof_on) == type))
profile_hits(type, ip, 1);
 }
 
Index: linux-2.6-lttng/drivers/kvm/kvm_main.c

[patch 24/24] Linux Kernel Markers - Use Immediate Values

2007-12-20 Thread Mathieu Desnoyers
Make markers use immediate values.
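
The selection between the two read flavors in the patch below can be modelled in userspace roughly like this. The names struct marker_model, model_trace_mark() and count_probe() are hypothetical, and both imv_read() and _imv_read() are assumed to be plain reads here, so the sketch shows only the control flow and not the code patching itself.

```c
#include <assert.h>

/* Assumption: both flavors are plain reads in this model. */
#define imv_read(name)	(name)
#define _imv_read(name)	(name)

/* Hypothetical, cut-down marker: just a state flag and a probe. */
struct marker_model {
	char state;
	void (*call)(int *counter);
};

/*
 * "generic" is a compile-time constant, so an optimizing compiler can
 * drop the branch that is not taken.
 */
#define model_trace_mark(generic, m, counter)		\
	do {						\
		if (!(generic)) {			\
			if (imv_read((m)->state))	\
				(*(m)->call)(counter);	\
		} else {				\
			if (_imv_read((m)->state))	\
				(*(m)->call)(counter);	\
		}					\
	} while (0)

/* Hypothetical probe: counts how often the marker fired. */
static void count_probe(int *counter)
{
	assert(counter);
	(*counter)++;
}
```

In the patch, trace_mark() passes 0 for the generic flag and _trace_mark() passes 1.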

Changelog :
- Use imv_* instead of immediate_*.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 Documentation/markers.txt |   17 +
 include/linux/marker.h|   42 --
 kernel/marker.c   |8 ++--
 kernel/module.c   |1 +
 4 files changed, 52 insertions(+), 16 deletions(-)

Index: linux-2.6-lttng/include/linux/marker.h
===
--- linux-2.6-lttng.orig/include/linux/marker.h 2007-12-05 20:53:25.0 -0500
+++ linux-2.6-lttng/include/linux/marker.h 2007-12-05 20:53:54.0 -0500
@@ -12,6 +12,7 @@
  * See the file COPYING for more details.
  */
 
+#include 
 #include 
 
 struct module;
@@ -42,7 +43,7 @@ struct marker {
const char *format; /* Marker format string, describing the
 * variable argument list.
 */
-   char state; /* Marker state. */
+   DEFINE_IMV(char, state);    /* Immediate value state. */
char ptype; /* probe type : 0 : single, 1 : multi */
void (*call)(const struct marker *mdata,/* Probe wrapper */
void *call_private, const char *fmt, ...);
@@ -53,13 +54,14 @@ struct marker {
 #ifdef CONFIG_MARKERS
 
 /*
+ * Generic marker flavor always available.
  * Note : the empty asm volatile with read constraint is used here instead of a
  * "used" attribute to fix a gcc 4.1.x bug.
  * Make sure the alignment of the structure in the __markers section will
  * not add unwanted padding between the beginning of the section and the
  * structure. Force alignment to the same alignment as the section start.
  */
-#define __trace_mark(name, call_private, format, args...)  \
+#define __trace_mark(generic, name, call_private, format, args...) \
do {\
static const char __mstrtab_##name[]\
__attribute__((section("__markers_strings")))   \
@@ -70,17 +72,23 @@ struct marker {
0, 0, marker_probe_cb,  \
{ __mark_empty_function, NULL}, NULL }; \
__mark_check_format(format, ## args);   \
-   if (unlikely(__mark_##name.state)) {\
-   (*__mark_##name.call)   \
-   (&__mark_##name, call_private,  \
-   format, ## args);   \
+   if (!generic) { \
+   if (unlikely(imv_read(__mark_##name.state)))\
+   (*__mark_##name.call)   \
+   (&__mark_##name, call_private,  \
+   format, ## args);   \
+   } else {\
+   if (unlikely(_imv_read(__mark_##name.state)))   \
+   (*__mark_##name.call)   \
+   (&__mark_##name, call_private,  \
+   format, ## args);   \
}   \
} while (0)
 
 extern void marker_update_probe_range(struct marker *begin,
struct marker *end);
 #else /* !CONFIG_MARKERS */
-#define __trace_mark(name, call_private, format, args...) \
+#define __trace_mark(generic, name, call_private, format, args...) \
__mark_check_format(format, ## args)
 static inline void marker_update_probe_range(struct marker *begin,
struct marker *end)
@@ -88,15 +96,29 @@ static inline void marker_update_probe_r
 #endif /* CONFIG_MARKERS */
 
 /**
- * trace_mark - Marker
+ * trace_mark - Marker using code patching
  * @name: marker name, not quoted.
  * @format: format string
  * @args...: variable argument list
  *
- * Places a marker.
+ * Places a marker using optimized code patching technique (imv_read())
+ * to be enabled.
  */
 #define trace_mark(name, format, args...) \
-   __trace_mark(name, NULL, format, ## args)
+   __trace_mark(0, name, NULL, format, ## args)
+
+/**
+ * _trace_mark - Marker using variable read
+ * @name: marker name, not quoted.
+ * @format: format string
+ * @args...: variable argument list
+ *
+ * Places a marker using a standard memory read (_imv_read()) to be
+ * enabled. Should be used for markers in __init and __exit functions and in
+ * lockdep code.
+ */
+#define _trace_mark(name, format, args...) \
+   __trace_mark(1, name, NULL, format, ## args)
 
 /**
  * MARK_NOARGS - Format string for a marker with no argument.
Index: 

[patch 04/24] Add INIT_ARRAY() to kernel.h

2007-12-20 Thread Mathieu Desnoyers
Add initialization of an array, which needs brackets that would pollute kernel
code, to kernel.h. It is used to declare arguments passed as function parameters
such as:
text_poke(addr, INIT_ARRAY(unsigned char, 0xf0, len), len);
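
For illustration, a small userspace sketch of the macro in use follows. The macro body is the one added by the patch; PAD_LEN and fill_pad() are hypothetical, and memcpy() stands in for text_poke(). As far as I can tell, the len argument has to be an integer constant expression, since GNU C range designators need constant bounds and a variable-length compound literal cannot take an initializer.

```c
#include <assert.h>
#include <string.h>

/* Same definition as in the patch; relies on GNU C range designators. */
#define INIT_ARRAY(type, val, len) ((type [len]) { [0 ... (len)-1] = (val) })

#define PAD_LEN 8	/* hypothetical; must be a compile-time constant */

/*
 * Hypothetical helper: overwrite PAD_LEN bytes at dst with 0xf0, using
 * memcpy() where the patch description uses text_poke().
 */
static void fill_pad(unsigned char *dst)
{
	assert(dst);
	memcpy(dst, INIT_ARRAY(unsigned char, 0xf0, PAD_LEN), PAD_LEN);
}
```

The compound literal yields an unnamed array that lives until the end of the enclosing block, so it can be passed straight to a function without declaring a temporary.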

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/linux/kernel.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/linux/kernel.h
===
--- linux-2.6-lttng.orig/include/linux/kernel.h 2007-11-13 09:25:29.0 -0500
+++ linux-2.6-lttng/include/linux/kernel.h 2007-11-13 09:45:38.0 -0500
@@ -421,4 +421,6 @@ struct sysinfo {
 #define NUMA_BUILD 0
 #endif
 
+#define INIT_ARRAY(type, val, len) ((type [len]) { [0 ... (len)-1] = (val) })
+
 #endif

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


[patch 23/24] Immediate Values Use Arch NMI and MCE Support

2007-12-20 Thread Mathieu Desnoyers
Remove the architecture agnostic code now replaced by architecture specific,
atomic instruction updates.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/linux/immediate.h |   11 
 kernel/immediate.c|  113 +-
 2 files changed, 4 insertions(+), 120 deletions(-)

Index: linux-2.6-lttng/kernel/immediate.c
===
--- linux-2.6-lttng.orig/kernel/immediate.c 2007-11-26 12:48:48.0 -0500
+++ linux-2.6-lttng/kernel/immediate.c  2007-11-26 13:01:15.0 -0500
@@ -19,9 +19,6 @@
 #include 
 #include 
 #include 
-#include 
-
-#include 
 
 /*
  * Kernel ready to execute the SMP update that may depend on trap and ipi.
@@ -37,111 +34,6 @@ extern const struct __imv __stop___imv[]
  */
 static DEFINE_MUTEX(imv_mutex);
 
-static atomic_t wait_sync;
-
-struct ipi_loop_data {
-   long value;
-   const struct __imv *imv;
-} loop_data;
-
-static void ipi_busy_loop(void *arg)
-{
-   unsigned long flags;
-
-   local_irq_save(flags);
-   atomic_dec(&wait_sync);
-   do {
-   /* Make sure the wait_sync gets re-read */
-   smp_mb();
-   } while (atomic_read(&wait_sync) > loop_data.value);
-   atomic_dec(&wait_sync);
-   do {
-   /* Make sure the wait_sync gets re-read */
-   smp_mb();
-   } while (atomic_read(&wait_sync) > 0);
-   /*
-* Issuing a synchronizing instruction must be done on each CPU before
-* reenabling interrupts after modifying an instruction. Required by
-* Intel's errata.
-*/
-   sync_core();
-   flush_icache_range(loop_data.imv->imv,
-   loop_data.imv->imv + loop_data.imv->size);
-   local_irq_restore(flags);
-}
-
-/**
- * apply_imv_update - update one immediate value
- * @imv: pointer of type const struct __imv to update
- *
- * Update one immediate value. Must be called with imv_mutex held.
- * It makes sure all CPUs are not executing the modified code by having them
- * busy looping with interrupts disabled.
- * It does _not_ protect against NMI and MCE (could be a problem with Intel's
- * errata if we use immediate values in their code path).
- */
-static int apply_imv_update(const struct __imv *imv)
-{
-   unsigned long flags;
-   long online_cpus;
-
-   /*
-* If the variable and the instruction have the same value, there is
-* nothing to do.
-*/
-   switch (imv->size) {
-   case 1: if (*(uint8_t *)imv->imv
-   == *(uint8_t *)imv->var)
-   return 0;
-   break;
-   case 2: if (*(uint16_t *)imv->imv
-   == *(uint16_t *)imv->var)
-   return 0;
-   break;
-   case 4: if (*(uint32_t *)imv->imv
-   == *(uint32_t *)imv->var)
-   return 0;
-   break;
-   case 8: if (*(uint64_t *)imv->imv
-   == *(uint64_t *)imv->var)
-   return 0;
-   break;
-   default:return -EINVAL;
-   }
-
-   if (imv_early_boot_complete) {
-   kernel_text_lock();
-   lock_cpu_hotplug();
-   online_cpus = num_online_cpus();
-   atomic_set(&wait_sync, 2 * online_cpus);
-   loop_data.value = online_cpus;
-   loop_data.imv = imv;
-   smp_call_function(ipi_busy_loop, NULL, 1, 0);
-   local_irq_save(flags);
-   atomic_dec(&wait_sync);
-   do {
-   /* Make sure the wait_sync gets re-read */
-   smp_mb();
-   } while (atomic_read(&wait_sync) > online_cpus);
-   text_poke((void *)imv->imv, (void *)imv->var,
-   imv->size);
-   /*
-* Make sure the modified instruction is seen by all CPUs before
-* we continue (visible to other CPUs and local interrupts).
-*/
-   wmb();
-   atomic_dec(&wait_sync);
-   flush_icache_range(imv->imv,
-   imv->imv + imv->size);
-   local_irq_restore(flags);
-   unlock_cpu_hotplug();
-   kernel_text_unlock();
-   } else
-   text_poke_early((void *)imv->imv, (void *)imv->var,
-   imv->size);
-   return 0;
-}
-
 /**
  * imv_update_range - Update immediate values in a range
  * @begin: pointer to the beginning of the range
@@ -154,9 +46,12 @@ void imv_update_range(const struct __imv
 {
const struct __imv *iter;
int ret;
+
for (iter = begin; iter < end; iter++) {
mutex_lock(&imv_mutex);
-   ret = apply_imv_update(iter);
+   kernel_text_lock();
+   ret = arch_imv_update(iter, 

Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-20 Thread Oren Laadan


Pavel Emelyanov wrote:
> Oren Laadan wrote:
>> Serge E. Hallyn wrote:
>>> Quoting Pavel Emelyanov ([EMAIL PROTECTED]):
>>>> Oren Laadan wrote:
>>>>> Serge E. Hallyn wrote:
>>>>>> Quoting Oren Laadan ([EMAIL PROTECTED]):
>>>>>>> I hate to bring this again, but what if the admin in the container
>>>>>>> mounts an external file system (eg. nfs, usb, loop mount from a file,
>>>>>>> or via fuse), and that file system already has a device that we would
>>>>>>> like to ban inside that container ?
>>>>>> Miklos' user mount patches enforced that if !capable(CAP_MKNOD),
>>>>>> then mnt->mnt_flags |= MNT_NODEV.  So that's no problem.
>>>>> Yes, that works to disallow all device files from a mounted file system.
>>>>>
>>>>> But it's a black and white thing: either they are all banned or allowed;
>>>>> you can't have some devices allowed and others not, depending on type
>>>>> A scenario where this may be useful is, for instance, if we some apps in
>>>>> the container to execute withing a pre-made chroot (sub)tree within that
>>>>> container.
>>>>>
>>>>>> But that's been pulled out of -mm! ?  Crap.
>>>>>>
>>>>>>> Since anyway we will have to keep a white- (or black-) list of devices
>>>>>>> that are permitted in a container, and that list may change even change
>>>>>>> per container -- why not enforce the access control at the VFS layer ?
>>>>>>> It's safer in the long run.
>>>>>> By that you mean more along the lines of Pavel's patch than my whitelist
>>>>>> LSM, or you actually mean Tetsuo's filesystem (i assume you don't mean that
>>>>>> by 'vfs layer' :), or something different entirely?
>>>>> :)
>>>>>
>>>>> By 'vfs' I mean at open() time, and not at mount(), or mknod() time.
>>>>> Either yours or Pavel's; I tend to prefer not to use LSM as it may
>>>>> collide with future security modules.
>>>> Oren, AFAIS you've seen my patches for device access controller, right?
>> If you mean this one:
>> http://openvz.org/pipermail/devel/2007-September/007647.html
>> then ack :)
> 
> Great! Thanks.
> 
>>>> Maybe we can revisit the issue then and try to come to agreement on what
>>>> kind of model and implementation we all want?
>>> That would be great, Pavel.  I do prefer your solution over my LSM, so
>>> if we can get an elegant block device control right in the vfs code that
>>> would be my preference.
>> I concur.
>>
>> So it seems to me that we are all in favor of the model where open()
>> of a device will consult a black/white-list. Also, we are all in favor
>> of a non-LSM implementation, Pavel's code being a good example.
> 
> Thank you, Oren and Serge! I will revisit this issue then, but
> I have a vacation the next week and, after this, we have a New
> Year and Christmas holidays in Russia. So I will be able to go
> on with it only after the 7th January :( Hope this is OK for you.
> 
> Besides, Andrew told that he would pay little attention to new
> features till the 2.6.24 release, so I'm afraid we won't have this 
> even in -mm in the nearest months :(

Sounds great !  (as for the delay, it wasn't the highest priority issue
to begin with, so no worries).

Ah.. coincidentally they are celebrated here, too, at the same time :D
Merry Christmas and Happy New Year !

Oren.

> 
> Thanks,
> Pavel
> 
>> Oren.
>>
>>> The only thing that makes me keep wanting to go back to an LSM is the
>>> fact that the code defining the whitelist seems out of place in the vfs.
>>> But I guess that's actually separated into a modular cgroup, with the
>>> actual enforcement built in at the vfs.  So that's really the best
>>> solution.
>>>
>>> -serge
> 


Re: Trailing periods in kernel messages

2007-12-20 Thread Frans Pop
On Thursday 20 December 2007, Alan Cox wrote:
> The kernel printk messages are sentences.

I'm afraid that I completely and utterly disagree. Kernel messages are _not_ 
sentences. The vast majority is not well-formed and does not contain any of 
the elements that are required for a proper sentence.

The most kernel messages can be compared to is a rather diverse and sloppy 
enumeration. And enumerations follow completely different rules than 
sentences. It can better be characterized as a "semi-random sequence of 
context-sensitive technical messages".

IMHO the existing rule that "Kernel messages do not have to be terminated 
with a period." is completely justified, though it does need some minor 
clarification on the cases in which proper punctuation _should_ be 
followed.


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

> First off, I would like to see confirmation from the horses' mouths
> here (namely AMD, ServerWorks/Broadcom, and whoever else) that there
> is no other way to get around this problem than disabling MMCONFIG for
> accesses behind those chips.





And here are the excerpts from that page of the spec which are salient
to the present discussion:

--

The base configuration space of the AMD-8132 and PCI(-X) devices attached to it
are accessible using only the mechanism defined in PCI 2.3. Registers of PCI-X
Mode 2 devices attached to the AMD-8132 in the extended configuration space are
not accessible. The AMD-8132 has no registers in the extended configuration
space.

Fix Planned
No



