2.6.20-rc1-mm1

2006-12-14 Thread Andrew Morton

Temporarily at

http://userweb.kernel.org/~akpm/2.6.20-rc1-mm1/

Will appear later at


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc1/2.6.20-rc1-mm1/



- Added the avr32 devel tree as git-avr32.patch (Haavard Skinnemoen)

- Don't enable locking API self-tests on powerpc - it explodes in a
  spectacular fashion.




Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

echo subscribe mm-commits | mail [EMAIL PROTECTED]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Semi-daily snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.



Changes since 2.6.19-mm1:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-avr32.patch
 git-cpufreq.patch
 git-drm.patch
 git-dvb.patch
 git-gfs2-nmw.patch
 git-ieee1394.patch
 git-infiniband.patch
 git-libata-all.patch
 git-lxdialog.patch
 git-mmc.patch
 git-mmc-fixup.patch
 git-mtd.patch
 git-ubi.patch
 git-netdev-all.patch
 git-ioat.patch
 git-ocfs2.patch
 git-pcmcia.patch
 git-chelsio.patch
 git-selinux.patch
 git-pciseg.patch
 git-s390.patch
 git-sh.patch
 git-sas.patch
 git-sparc64.patch
 git-qla3xxx.patch
 git-wireless.patch
 git-gccbug.patch

 git trees.

-x86-smp-export-smp_num_siblings-for-oprofile.patch
-tty-export-get_current_tty.patch
-ieee80211softmac-fix-errors-related-to-the-work_struct-changes.patch
-kvm-add-missing-include.patch
-kvm-put-kvm-in-a-new-virtualization-menu.patch
-kvm-clean-up-amd-svm-debug-registers-load-and-unload.patch
-kvm-replace-__x86_64__-with-config_x86_64.patch
-fix-more-workqueue-build-breakage-tps65010.patch
-another-build-fix-header-rearrangements-osk.patch
-uml-fix-net_kern-workqueue-abuse.patch
-isdn-gigaset-fix-possible-missing-wakeup.patch
-i2o_exec_exit-and-i2o_driver_exit-should-not-be-__exit.patch
-cpufreq-fix-bug-in-duplicate-freq-elimination-code-in-acpi-cpufreq.patch
-gregkh-driver-modules-state.patch
-gregkh-driver-driver-core-delete-virtual-directory-on-class_unregister.patch
-gregkh-driver-debugfs-inotify-create-mkdir-support.patch
-gregkh-driver-debugfs-coding-style-fixes.patch
-gregkh-driver-debugfs-file-directory-creation-error-handling.patch
-gregkh-driver-debugfs-more-file-directory-creation-error-handling.patch
-gregkh-driver-debugfs-file-directory-removal-fix.patch
-gregkh-driver-driver-core-platform_driver_probe-can-save-codespace-save-codespace.patch
-gregkh-driver-driver-core-make-platform_device_add_data-accept-a-const-pointer.patch
-gregkh-driver-driver-core-deprecate-pm_legacy-default-it-to-n.patch
-drm-fix-return-value-check.patch
-drm-handle-pci_enable_device-failure.patch
-jdelvare-i2c-i2c-documentation-typos.patch
-jdelvare-i2c-i2c-update-i2c-id-list.patch
-jdelvare-i2c-i2c-delete-ite-bus-driver.patch
-jdelvare-i2c-i2c-pnx-new-driver.patch
-jdelvare-i2c-i2c-ibm_iic-add_request_release_mem_region.patch
-jdelvare-i2c-i2c-nforce2-cleanup.patch
-jdelvare-i2c-i2c-lockdep-handle-recursive-locking.patch
-jdelvare-i2c-i2c-at91-new-bus-driver.patch
-jdelvare-i2c-i2c-dev-make-I2C_FUNCS-ioctl-faster.patch
-jdelvare-i2c-i2c-remove-extraneous-whitespace.patch
-jdelvare-i2c-i2c-core-use-__ATTR.patch
-jdelvare-i2c-i2c-i801-documentation-update.patch
-jdelvare-i2c-i2c-fix-broken-ds1337-initialization.patch
-jdelvare-i2c-i2c-versatile-new-arm-bus-driver.patch
-jdelvare-i2c-i2c-discard-del-bus-wrappers.patch
-jdelvare-i2c-i2c-i801-enable-PEC-on-ICH6.patch
-jdelvare-i2c-i2c-dev-fix-return-value-check.patch
-jdelvare-i2c-i2c-dev-merge-kfree.patch
-jdelvare-i2c-i2c-omap-prescaler-formula.patch
-jdelvare-hwmon-hwmon-f71805f-add-fanctl-1-prepare.patch
-jdelvare-hwmon-hwmon-f71805f-add-fanctl-2-manual-mode.patch
-jdelvare-hwmon-hwmon-f71805f-add-fanctl-3-pwm-freq.patch
-jdelvare-hwmon-hwmon-f71805f-add-fanctl-4-pwm-mode.patch
-jdelvare-hwmon-hwmon-f71805f-add-fanctl-5-speed-mode.patch

RE: 2.6.18.4: flush_workqueue calls mutex_lock in interrupt environment

2006-12-14 Thread Chen, Kenneth W
Chen, Kenneth wrote on Thursday, December 14, 2006 5:59 PM
> > It seems utterly insane to have aio_complete() flush a workqueue. That
> > function has to be called from a number of different environments,
> > including non-sleep tolerant environments.
> >
> > For instance it means that directIO on NFS will now cause the rpciod
> > workqueues to call flush_workqueue(aio_wq), thus slowing down all RPC
> > activity.
>
> The bug appears to be somewhere else, somehow the ref count on ioctx is
> all messed up.
>
> In aio_complete, __put_ioctx() should not be invoked because ref count
> on ioctx is supposedly more than 2, aio_complete decrement it once and
> should return without invoking the free function.
>
> The real freeing ioctx should be coming from exit_aio() or io_destroy(),
> in which case both wait until no further pending AIO request via
> wait_for_all_aios().

Ah, I think I see the bug: it must be a race between io_destroy() and
aio_complete().  A possible scenario:

cpu0                                cpu1
io_destroy                          aio_complete
  wait_for_all_aios {                 __aio_put_req
     ...                                  ctx->reqs_active--;
     if (!ctx->reqs_active)
        return;
  }
  ...
  put_ioctx(ioctx)

                                      put_ioctx(ctx);
                                         bam! Bug trigger!

An AIO finishes on cpu1, and while cpu1 is still in the middle of aio_complete,
cpu0 starts the io_destroy sequence, sees no pending AIO, and goes ahead and
decrements the ref count on ioctx.  At a later point in aio_complete, put_ioctx
drops the last reference and calls the ioctx freeing function, which is where
the bug warning is triggered.

A simple fix would be to access ctx->reqs_active inside the ctx spin lock in
wait_for_all_aios().  In the meantime, I would like to remove ref counting
for each iocb because we are already performing ref counting using reqs_active.
This would also prevent similar buggy code in the future.
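
The interleaving in the diagram can be replayed deterministically in user
space.  The sketch below is a toy model, not kernel code: toy_ioctx,
toy_put_ioctx, toy_replay_race and toy_replay_fixed are hypothetical names,
and the starting reference count of 2 (one for the context, one for the
in-flight request) is an assumption chosen to match the scenario.

```c
#include <assert.h>

/* Toy model of the counters discussed above; none of this is kernel code. */
enum { NOBODY = 0, CPU0_IO_DESTROY = 1, CPU1_AIO_COMPLETE = 2 };

struct toy_ioctx {
	int users;       /* references held on the context */
	int reqs_active; /* requests still in flight */
	int freed_by;    /* which side dropped the last reference */
};

/* Drop one reference and record who dropped the last one. */
static void toy_put_ioctx(struct toy_ioctx *ctx, int cpu)
{
	assert(ctx->users > 0);
	if (--ctx->users == 0)
		ctx->freed_by = cpu;
}

/* Old scheme: each request also holds a reference on the context. */
int toy_replay_race(void)
{
	struct toy_ioctx ctx = { .users = 2, .reqs_active = 1, .freed_by = NOBODY };

	/* cpu1: aio_complete -> __aio_put_req */
	ctx.reqs_active--;

	/* cpu0: wait_for_all_aios sees nothing pending; io_destroy drops its ref */
	if (!ctx.reqs_active)
		toy_put_ioctx(&ctx, CPU0_IO_DESTROY);

	/* cpu1: still inside aio_complete, drops the per-request reference */
	toy_put_ioctx(&ctx, CPU1_AIO_COMPLETE);

	return ctx.freed_by;
}

/* Proposed scheme: requests are tracked by reqs_active only. */
int toy_replay_fixed(void)
{
	struct toy_ioctx ctx = { .users = 1, .reqs_active = 1, .freed_by = NOBODY };

	/* cpu1: completion only decrements reqs_active (under the ctx lock) */
	ctx.reqs_active--;

	/* cpu0: sees nothing pending and drops the only reference */
	if (!ctx.reqs_active)
		toy_put_ioctx(&ctx, CPU0_IO_DESTROY);

	return ctx.freed_by;
}
```

In this model toy_replay_race() ends with the completion side performing the
final free, which is the situation the warning complains about, while
toy_replay_fixed() leaves the final free to the io_destroy side.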


Signed-off-by: Ken Chen [EMAIL PROTECTED]

--- ./fs/aio.c.orig 2006-11-29 13:57:37.0 -0800
+++ ./fs/aio.c  2006-12-14 20:45:14.0 -0800
@@ -298,17 +298,23 @@ static void wait_for_all_aios(struct kio
 	struct task_struct *tsk = current;
 	DECLARE_WAITQUEUE(wait, tsk);
 
+	spin_lock_irq(&ctx->ctx_lock);
 	if (!ctx->reqs_active)
-		return;
+		goto out;
 
 	add_wait_queue(&ctx->wait, &wait);
 	set_task_state(tsk, TASK_UNINTERRUPTIBLE);
 	while (ctx->reqs_active) {
+		spin_unlock_irq(&ctx->ctx_lock);
 		schedule();
 		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+		spin_lock_irq(&ctx->ctx_lock);
 	}
 	__set_task_state(tsk, TASK_RUNNING);
 	remove_wait_queue(&ctx->wait, &wait);
+
+out:
+	spin_unlock_irq(&ctx->ctx_lock);
 }
 
 /* wait_on_sync_kiocb:
@@ -425,7 +431,6 @@ static struct kiocb fastcall *__aio_get_
 	ring = kmap_atomic(ctx->ring_info.ring_pages[0], KM_USER0);
 	if (ctx->reqs_active < aio_ring_avail(&ctx->ring_info, ring)) {
 		list_add(&req->ki_list, &ctx->active_reqs);
-		get_ioctx(ctx);
 		ctx->reqs_active++;
 		okay = 1;
 	}
@@ -538,8 +543,6 @@ int fastcall aio_put_req(struct kiocb *r
 	spin_lock_irq(&ctx->ctx_lock);
 	ret = __aio_put_req(ctx, req);
 	spin_unlock_irq(&ctx->ctx_lock);
-	if (ret)
-		put_ioctx(ctx);
 	return ret;
 }
 
@@ -795,8 +798,7 @@ static int __aio_run_iocbs(struct kioctx
 		 */
 		iocb->ki_users++;	/* grab extra reference */
 		aio_run_iocb(iocb);
-		if (__aio_put_req(ctx, iocb))  /* drop extra ref */
-			put_ioctx(ctx);
+		__aio_put_req(ctx, iocb);
 	}
 	if (!list_empty(&ctx->run_list))
 		return 1;
@@ -942,7 +944,6 @@ int fastcall aio_complete(struct kiocb *
 	struct io_event	*event;
 	unsigned long	flags;
 	unsigned long	tail;
-	int		ret;
 
 	/*
 	 * Special case handling for sync iocbs:
@@ -1011,18 +1012,12 @@ int fastcall aio_complete(struct kiocb *
 	pr_debug("%ld retries: %zd of %zd\n", iocb->ki_retried,
 		iocb->ki_nbytes - iocb->ki_left, iocb->ki_nbytes);
 put_rq:
-	/* everything turned out well, dispose of the aiocb. */
-	ret = __aio_put_req(ctx, iocb);
-
 	spin_unlock_irqrestore(&ctx->ctx_lock, flags);
 
 	if (waitqueue_active(&ctx->wait))
 		wake_up(&ctx->wait);
 
-	if (ret)
-		put_ioctx(ctx);
-
-	return ret;
+	return aio_put_req(iocb);
 }
 
 /* aio_read_evt
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: realtime-preempt and arm

2006-12-14 Thread tike64
Steven Rostedt [EMAIL PROTECTED] wrote:
> So you got a big jitter using nanosleep???  If that's the case, could
> you post the times you got. I'll also boot a kernel with the latest
> -rt patch, without highres compiled, and see if I can reproduce the
> same on x86.

You're very kind! Here you go:

This is from Linux uclibc 2.6.14.2 #12 PREEMPT without -rt:

100 revs; min: 19888 max: 20386 avg: 20013
100 revs; min: 19724 max: 20296 avg: 20013
100 revs; min: 19920 max: 20322 avg: 20013
100 revs; min: 19840 max: 20323 avg: 20016
100 revs; min: 10276 max: 42789 avg: 21294
100 revs; min: 10466 max: 34080 avg: 21687
100 revs; min: 10249 max: 30594 avg: 21161
100 revs; min: 10962 max: 34421 avg: 21415
100 revs; min: 10437 max: 31338 avg: 20562
100 revs; min: 11660 max: 29751 avg: 21066
100 revs; min: 10457 max: 30612 avg: 21417
100 revs; min: 10270 max: 37828 avg: 21513

First four lines are with the system otherwise idle. Then I fired 'ls
-Rl /mnt/some/nfs/share' on a framebuffer console.

And the same on a Linux uclibc 2.6.18-rt6 #19 PREEMPT:

100 revs; min: 19847 max: 20242 avg: 20014
100 revs; min: 19685 max: 20332 avg: 20014
100 revs; min: 19652 max: 20374 avg: 20014
100 revs; min: 19622 max: 20399 avg: 20012
100 revs; min: 19736 max: 26612 avg: 20074
100 revs; min: 19478 max: 21199 avg: 20021
100 revs; min: 19569 max: 21093 avg: 20022
100 revs; min: 19582 max: 20460 avg: 20017
100 revs; min: 19723 max: 20410 avg: 20016
100 revs; min: 19459 max: 24565 avg: 20056
100 revs; min: 19610 max: 24257 avg: 20053
100 revs; min: 19376 max: 26848 avg: 20079
100 revs; min: 19445 max: 26522 avg: 20077
100 revs; min: 19510 max: 22349 avg: 20034
100 revs; min: 19562 max: 20334 avg: 20017

The one to be blamed the most seems to be FB. 'ls ... > /dev/null'
leads to less than 2ms slips.

I'm supposed to make a 10ms control loop, so I could live with a couple
of ms jitter. 7ms is rather high and I think it tells about some
problem which makes one wonder if even higher occasional slips are
possible.

I made my test code visible if you want to take a look: www dot
riihineva dot no-ip dot org uphill public uphill test-rt.c
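
The linked test-rt.c is not reproduced here.  As a rough sketch of how
figures like the "min/max/avg" lines above could be produced, assuming that
each rev is one timed sleep and that the values are microseconds (neither is
confirmed by this message); rev_stats, summarize_revs and timed_sleep_us are
hypothetical names:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

struct rev_stats {
	long min_us;
	long max_us;
	long avg_us;
};

/* Reduce n interval samples (microseconds) to min/max/avg; n must be > 0. */
struct rev_stats summarize_revs(const long *us, size_t n)
{
	struct rev_stats s = { us[0], us[0], 0 };
	long long sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (us[i] < s.min_us)
			s.min_us = us[i];
		if (us[i] > s.max_us)
			s.max_us = us[i];
		sum += us[i];
	}
	s.avg_us = (long)(sum / (long long)n);
	return s;
}

/* Sleep for period_us and return how long the sleep really took, in us. */
long timed_sleep_us(long period_us)
{
	struct timespec req = { period_us / 1000000L,
				(period_us % 1000000L) * 1000L };
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	nanosleep(&req, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) * 1000000L +
	       (t1.tv_nsec - t0.tv_nsec) / 1000L;
}
```

Calling timed_sleep_us(20000) a hundred times and passing the results to
summarize_revs() would give one line of output; the spread between min_us and
max_us is the jitter being discussed.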

--

tike



 



Re: Abolishing the DMCA (was GPL only modules)

2006-12-14 Thread Willy Tarreau
On Thu, Dec 14, 2006 at 01:09:06PM -0800, Michael ODonald wrote:
> Linus Torvalds wrote:
> > DMCA is bad because it puts technical limits over
> > the rights expressly granted by copyright law.
>
> The best ways to get rich corporations on our side in fighting the
> DMCA is to use the DMCA to hurt their profits. Companies that rely on
> binary drivers would have several options:
>
> 1) Lobby politicians to repeal the DMCA, thereby allowing the
> companies to *internally* circumvent Linux’s GPL-only
> pseudo-restriction all they want by simply changing the source code.
>
> 2) Release the binary drivers as open source or use their economic
> clout to pressure the makers of the binary drivers.
>
> 3) Use FOSS-friendly hardware.
>
> I’m sorry, but there’s currently no economic push for repealing the
> DMCA; the only people trying to abolish it are idealists who are
> easily out-bought by the media cartel. This is our only chance to put
> some corporate money muscle behind the otherwise doomed anti-DMCA
> movement.

4) make no effort to support Linux

You're not the center of the world, never forget it !

Willy



Re: [patch 2/3] acpi: Add a docked sysfs file to the dock driver.

2006-12-14 Thread Len Brown
On Thursday 14 December 2006 02:16, Holger Macht wrote:
> On Mon 11. Dec - 12:05:08, Kristen Carlson Accardi wrote:

> > Ok - how is this?

> Looks good to me, thanks!

> > Signed-off-by: Kristen Carlson Accardi [EMAIL PROTECTED]

> Signed-off-by: Holger Macht [EMAIL PROTECTED]

Applied.
thanks,
-Len

commit 8ea86e0ba7c9d16ae0f35cb0c4165194fa573f7a
Author: Kristen Carlson Accardi [EMAIL PROTECTED]
Date:   Mon Dec 11 12:05:08 2006 -0800

ACPI: dock: add uevent to indicate change in device status

Send a uevent to indicate a device change whenever we dock or
undock, so that userspace may now check the dock status via sysfs.

Signed-off-by: Kristen Carlson Accardi [EMAIL PROTECTED]
Signed-off-by: Holger Macht [EMAIL PROTECTED]
Signed-off-by: Len Brown [EMAIL PROTECTED]
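
On the userspace side, reacting to the change uevent could be as simple as
re-reading the docked file named in the subject.  A minimal sketch, assuming
the file holds a single '0' or '1'; read_flag_file is a hypothetical helper,
and the path in the usage note is an assumption, not taken from this patch:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a one-character sysfs-style flag file.
 * Returns 1 or 0 for a leading '1' or '0', and -1 on any error.
 */
int read_flag_file(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == '1')
		return 1;
	if (c == '0')
		return 0;
	return -1;
}
```

A caller might use read_flag_file("/sys/devices/platform/dock.0/docked")
after receiving the event, treating -1 as "no dock driver present".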

diff --git a/drivers/acpi/dock.c b/drivers/acpi/dock.c
index 8c6828b..215f5b3 100644
--- a/drivers/acpi/dock.c
+++ b/drivers/acpi/dock.c
@@ -326,10 +326,12 @@ static void hotplug_dock_devices(struct dock_station *ds, u32 event)
 
 static void dock_event(struct dock_station *ds, u32 event, int num)
 {
+	struct device *dev = &dock_device.dev;
 	/*
-	 * we don't do events until someone tells me that
-	 * they would like to have them.
+	 * Indicate that the status of the dock station has
+	 * changed.
 	 */
+	kobject_uevent(&dev->kobj, KOBJ_CHANGE);
 }
 
 /**


Asynchronous Crypto support for MPC8360E's Security Engine

2006-12-14 Thread n . balaji

Hi,
  I am working on the MPC8360E Security Engine. I have ported Openswan
2.4.5 (IPSec --KLIPS) with OCF to the MPC8360E's Security Engine (Talitos).
Encryption and decryption are working. But when I check the performance of
Talitos with the netio benchmark tool, the IPSec software algorithms give more
bandwidth than Talitos.
  I do not know why Talitos gives less bandwidth, or whether the problem is
in Openswan, OCF, the Talitos driver, or the Talitos hardware. Please give your
suggestions, and if you have any links related to Talitos, send them to me.

  Linux kernel version is 2.6.11.

 I am not a member of the above mailing lists. Please send the mail to me.

-Thanks
 N.Balaji




[PATCH 4/6] SMP boot hook for paravirt

2006-12-14 Thread Zachary Amsden
Add VMI SMP boot hook.  We emulate a regular boot sequence and use the
same APIC IPI initiation; we just poke magic values to load into the CPU
state when the startup IPI is received, rather than having to jump through a
real mode trampoline.

This is all that was needed to get SMP to work.
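
The shape of such a hook, an ops-table entry that defaults to a no-op and is
replaced by a backend, can be sketched in user space as follows.  This is an
illustration only: every toy_* name is hypothetical, and the start addresses
passed in are placeholder values.

```c
#include <assert.h>

/* Hypothetical miniature of an ops table with one optional hook. */
struct toy_ops {
	void (*startup_ipi_hook)(int phys_apicid, unsigned long start_eip,
				 unsigned long start_esp);
};

int toy_hook_calls;       /* how many times the backend hook ran */
int toy_last_apicid = -1; /* APIC id last seen by the backend hook */

/* Native default: nothing to do before the startup IPI. */
static void toy_native_nop(int phys_apicid, unsigned long start_eip,
			   unsigned long start_esp)
{
	(void)phys_apicid;
	(void)start_eip;
	(void)start_esp;
}

/* Stand-in for a hypervisor backend that records the target CPU. */
static void toy_backend_hook(int phys_apicid, unsigned long start_eip,
			     unsigned long start_esp)
{
	(void)start_eip;
	(void)start_esp;
	toy_hook_calls++;
	toy_last_apicid = phys_apicid;
}

struct toy_ops toy_ops = { .startup_ipi_hook = toy_native_nop };

/* Backend registration: replace the default with the backend's hook. */
void toy_install_backend(void)
{
	toy_ops.startup_ipi_hook = toy_backend_hook;
}

/* Caller side: always goes through the table, never tests for a backend. */
void toy_wakeup_secondary(int phys_apicid)
{
	toy_ops.startup_ipi_hook(phys_apicid, 0x1000UL, 0x2000UL);
	/* ...the normal startup IPI sequence would follow here... */
}
```

Because the default is a no-op, the calling code stays identical on native
hardware and under a hypervisor; only the table entry differs.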

Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
Subject: SMP boot hook for paravirt

diff -r acfb7a15715f arch/i386/kernel/paravirt.c
--- a/arch/i386/kernel/paravirt.c   Thu Dec 14 16:22:03 2006 -0800
+++ b/arch/i386/kernel/paravirt.c   Thu Dec 14 16:51:48 2006 -0800
@@ -572,5 +572,7 @@ struct paravirt_ops paravirt_ops = {
 
.irq_enable_sysexit = native_irq_enable_sysexit,
.iret = native_iret,
+
+   .startup_ipi_hook = (void *)native_nop,
 };
 EXPORT_SYMBOL(paravirt_ops);
diff -r acfb7a15715f arch/i386/kernel/smpboot.c
--- a/arch/i386/kernel/smpboot.c   Thu Dec 14 16:22:03 2006 -0800
+++ b/arch/i386/kernel/smpboot.c   Thu Dec 14 16:51:52 2006 -0800
@@ -831,6 +831,13 @@ wakeup_secondary_cpu(int phys_apicid, un
num_starts = 0;
 
/*
+* Paravirt / VMI wants a startup IPI hook here to set up the
+* target processor state.
+*/
+   startup_ipi_hook(phys_apicid, (unsigned long) start_secondary,
+(unsigned long) stack_start.esp);
+
+   /*
 * Run STARTUP IPI loop.
 */
Dprintk("#startup loops: %d.\n", num_starts);
diff -r acfb7a15715f include/asm-i386/paravirt.h
--- a/include/asm-i386/paravirt.h   Thu Dec 14 16:22:03 2006 -0800
+++ b/include/asm-i386/paravirt.h   Thu Dec 14 16:51:48 2006 -0800
@@ -151,6 +151,8 @@ struct paravirt_ops
/* These two are jmp to, not actually called. */
void (fastcall *irq_enable_sysexit)(void);
void (fastcall *iret)(void);
+
+   void (fastcall *startup_ipi_hook)(int phys_apicid, unsigned long start_eip, unsigned long start_esp);
 };
 
 /* Mark a paravirt probe function. */
@@ -323,6 +325,13 @@ static inline unsigned long apic_read(un
 }
 #endif
 
+#ifdef CONFIG_SMP
+static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
+   unsigned long start_esp)
+{
+   return paravirt_ops.startup_ipi_hook(phys_apicid, start_eip, start_esp);
+}
+#endif
 
 #define __flush_tlb() paravirt_ops.flush_tlb_user()
 #define __flush_tlb_global() paravirt_ops.flush_tlb_kernel()
diff -r acfb7a15715f include/asm-i386/smp.h
--- a/include/asm-i386/smp.h   Thu Dec 14 16:22:03 2006 -0800
+++ b/include/asm-i386/smp.h   Thu Dec 14 16:52:21 2006 -0800
@@ -52,6 +52,11 @@ extern void cpu_uninit(void);
 extern void cpu_uninit(void);
 #endif
 
+#ifndef CONFIG_PARAVIRT
+#define startup_ipi_hook(phys_apicid, start_eip, start_esp)\
+do { } while (0)
+#endif
+
 /*
  * This function is needed by all SMP systems. It must _always_ be valid
  * from the initial startup. We map APIC_BASE very early in page_setup(),


[PATCH 6/6] VMI timer patches

2006-12-14 Thread Zachary Amsden
VMI timer code.  It works by taking over the local APIC clock when APIC is
configured, which requires a couple hooks into the APIC code.  The backend
timer code could be commonized into the timer infrastructure, but there are
some pieces missing (stolen time, in particular), and the exact semantics
of when to do accounting for NO_IDLE need to be shared between different
hypervisors as well.  So for now, VMI timer is a separate module.

Subject: VMI timer patches
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]

diff -r 77e4058e936b arch/i386/Kconfig
--- a/arch/i386/Kconfig Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/Kconfig Thu Dec 14 16:40:16 2006 -0800
@@ -1227,3 +1227,12 @@ config KTIME_SCALAR
 config KTIME_SCALAR
bool
default y
+
+config NO_IDLE_HZ
+   bool
+   depends on PARAVIRT
+   default y
+   help
+ Switches the regular HZ timer off when the system is going idle.
+ This helps a hypervisor detect that the Linux system is idle,
+ reducing the overhead of idle systems.
diff -r 77e4058e936b arch/i386/kernel/Makefile
--- a/arch/i386/kernel/Makefile Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/Makefile Thu Dec 14 16:40:16 2006 -0800
@@ -40,7 +40,7 @@ obj-$(CONFIG_HPET_TIMER)  += hpet.o
 obj-$(CONFIG_HPET_TIMER)   += hpet.o
 obj-$(CONFIG_K8_NB)+= k8.o
 
-obj-$(CONFIG_VMI)  += vmi.o
+obj-$(CONFIG_VMI)  += vmi.o vmitime.o
 
 # Make sure this is linked after any other paravirt_ops structs: see head.S
 obj-$(CONFIG_PARAVIRT) += paravirt.o
diff -r 77e4058e936b arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c   Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/apic.c   Thu Dec 14 16:40:16 2006 -0800
@@ -1395,7 +1395,7 @@ int __init APIC_init_uniprocessor (void)
if (!skip_ioapic_setup && nr_ioapics)
setup_IO_APIC();
 #endif
-   setup_boot_APIC_clock();
+   setup_boot_clock();
 
return 0;
 }
diff -r 77e4058e936b arch/i386/kernel/entry.S
--- a/arch/i386/kernel/entry.S  Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/entry.S  Thu Dec 14 16:40:16 2006 -0800
@@ -622,6 +622,11 @@ ENTRY(name)\
 /* The include is where all of the SMP etc. interrupts come from */
 #include "entry_arch.h"
 
+/* This alternate entry is needed because we hijack the apic LVTT */
+#if defined(CONFIG_VMI) && defined(CONFIG_X86_LOCAL_APIC)
+BUILD_INTERRUPT(apic_vmi_timer_interrupt,LOCAL_TIMER_VECTOR)
+#endif
+
 KPROBE_ENTRY(page_fault)
RING0_EC_FRAME
pushl $do_page_fault
diff -r 77e4058e936b arch/i386/kernel/paravirt.c
--- a/arch/i386/kernel/paravirt.c   Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/paravirt.c   Thu Dec 14 16:40:16 2006 -0800
@@ -544,6 +544,8 @@ struct paravirt_ops paravirt_ops = {
.apic_write = native_apic_write,
.apic_write_atomic = native_apic_write_atomic,
.apic_read = native_apic_read,
+   .setup_boot_clock = setup_boot_APIC_clock,
+   .setup_secondary_clock = setup_secondary_APIC_clock,
 #endif
.set_lazy_mode = (void *)native_nop,
 
diff -r 77e4058e936b arch/i386/kernel/smpboot.c
--- a/arch/i386/kernel/smpboot.c   Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/smpboot.c   Thu Dec 14 16:40:16 2006 -0800
@@ -556,7 +556,7 @@ static void __devinit start_secondary(vo
smp_callin();
while (!cpu_isset(smp_processor_id(), smp_commenced_mask))
rep_nop();
-   setup_secondary_APIC_clock();
+   setup_secondary_clock();
if (nmi_watchdog == NMI_IO_APIC) {
disable_8259A_irq(0);
enable_NMI_through_LVT0(NULL);
@@ -1330,7 +1330,7 @@ static void __init smp_boot_cpus(unsigne
 
smpboot_setup_io_apic();
 
-   setup_boot_APIC_clock();
+   setup_boot_clock();
 
/*
 * Synchronize the TSC with the AP
diff -r 77e4058e936b arch/i386/kernel/time.c
--- a/arch/i386/kernel/time.c   Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/time.c   Thu Dec 14 16:40:16 2006 -0800
@@ -232,6 +232,7 @@ static void sync_cmos_clock(unsigned lon
 static void sync_cmos_clock(unsigned long dummy);
 
 static DEFINE_TIMER(sync_cmos_timer, sync_cmos_clock, 0, 0);
+int no_sync_cmos_clock;
 
 static void sync_cmos_clock(unsigned long dummy)
 {
@@ -275,7 +276,8 @@ static void sync_cmos_clock(unsigned lon
 
 void notify_arch_cmos_timer(void)
 {
-   mod_timer(&sync_cmos_timer, jiffies + 1);
+   if (!no_sync_cmos_clock)
+   mod_timer(&sync_cmos_timer, jiffies + 1);
 }
 
 static long clock_cmos_diff;
diff -r 77e4058e936b arch/i386/kernel/tsc.c
--- a/arch/i386/kernel/tsc.c   Thu Dec 14 16:40:14 2006 -0800
+++ b/arch/i386/kernel/tsc.c   Thu Dec 14 16:40:16 2006 -0800
@@ -23,6 +23,7 @@
  * an extra value to store the TSC freq
  */
 unsigned int tsc_khz;
+unsigned long long (*custom_sched_clock)(void);
 
 int tsc_disable __cpuinitdata = 0;
 
@@ 

[PATCH 0/6] VMI paravirt-ops patches

2006-12-14 Thread Zachary Amsden
These are the patches for the VMI backend to paravirt-ops.  Base
kernel where I tested them was 2.6.19-git20.

Basically, there are only a couple of hooks needed that were left
out of the initial paravirt-ops merge, and then the backend code
is a very straightforward implementation of the paravirt-ops
functions.

Andrew or Linus, please apply or shoot me nasty feedback that I
will promptly turn into marvelous looking code.  I've Cc'd Andi,
who originally was going to take up the patches, but seems to
have been snowed in.

Zach


[PATCH 1/6] Page allocation hooks for VMI backend

2006-12-14 Thread Zachary Amsden
The VMI backend uses explicit page type notification to track shadow
page tables.  The allocation of page table roots is especially tricky.
We need to clone the root for non-PAE mode while it is protected under
the pgd lock to correctly copy the shadow.

We don't need to allocate pgds (PDPs in Intel terminology) in PAE mode,
as they only have 4 entries and are cached entirely by the processor,
which makes shadowing them rather simple.

For base page table level allocation, pmd_populate provides the exact hook
point we need.  Also, we need to allocate pages when splitting a large page,
and we must release pages before returning the page to any free pool.

Despite being required with these slightly odd semantics for VMI, Xen also 
uses these hooks to determine the exact moment when page tables are created
or released.

Subject: Page allocation hooks for VMI backend
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -545,6 +545,12 @@ struct paravirt_ops paravirt_ops = {
.flush_tlb_kernel = native_flush_tlb_global,
.flush_tlb_single = native_flush_tlb_single,
 
+   .alloc_pt = (void *)native_nop,
+   .alloc_pd = (void *)native_nop,
+   .alloc_pd_clone = (void *)native_nop,
+   .release_pt = (void *)native_nop,
+   .release_pd = (void *)native_nop,
+
.set_pte = native_set_pte,
.set_pte_at = native_set_pte_at,
.set_pmd = native_set_pmd,
===
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -62,6 +62,7 @@ static pmd_t * __init one_md_table_init(

 #ifdef CONFIG_X86_PAE
pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
+   paravirt_alloc_pd(__pa(pmd_table) >> PAGE_SHIFT);
set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT));
pud = pud_offset(pgd, 0);
if (pmd_table != pmd_offset(pud, 0)) 
@@ -82,6 +83,7 @@ static pte_t * __init one_page_table_ini
 {
if (pmd_none(*pmd)) {
pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
+   paravirt_alloc_pt(__pa(page_table) >> PAGE_SHIFT);
set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
if (page_table != pte_offset_kernel(pmd, 0))
BUG();  
@@ -347,6 +349,8 @@ static void __init pagetable_init (void)
/* Init entries of the first-level page table to the zero page */
for (i = 0; i < PTRS_PER_PGD; i++)
set_pgd(pgd_base + i, __pgd(__pa(empty_zero_page) | _PAGE_PRESENT));
+#else
+   paravirt_alloc_pd(__pa(swapper_pg_dir) >> PAGE_SHIFT);
 #endif
 
/* Enable PSE if available */
===
--- a/arch/i386/mm/pageattr.c
+++ b/arch/i386/mm/pageattr.c
@@ -60,6 +60,7 @@ static struct page *split_large_page(uns
address = __pa(address);
addr = address & LARGE_PAGE_MASK;
pbase = (pte_t *)page_address(base);
+   paravirt_alloc_pt(page_to_pfn(base));
for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) {
set_pte(&pbase[i], pfn_pte(addr >> PAGE_SHIFT,
   addr == address ? prot : ref_prot));
@@ -166,6 +167,7 @@ __change_page_attr(struct page *page, pg
if (!PageReserved(kpte_page)) {
if (cpu_has_pse && (page_private(kpte_page) == 0)) {
ClearPagePrivate(kpte_page);
+   paravirt_release_pt(page_to_pfn(kpte_page));
list_add(&kpte_page->lru, &df_list);
revert_page(kpte_page, address);
}
===
--- a/arch/i386/mm/pgtable.c
+++ b/arch/i386/mm/pgtable.c
@@ -245,8 +245,14 @@ void pgd_ctor(void *pgd, kmem_cache_t *c
clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD,
swapper_pg_dir + USER_PTRS_PER_PGD,
KERNEL_PGD_PTRS);
+
if (PTRS_PER_PMD > 1)
return;
+
+   /* must happen under lock */
+   paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT,
+   __pa(swapper_pg_dir) >> PAGE_SHIFT,
+   USER_PTRS_PER_PGD, PTRS_PER_PGD - USER_PTRS_PER_PGD);
 
pgd_list_add(pgd);
spin_unlock_irqrestore(&pgd_lock, flags);
@@ -257,6 +263,7 @@ void pgd_dtor(void *pgd, kmem_cache_t *c
 {
unsigned long flags; /* can be called from interrupt context */
 
+   paravirt_release_pd(__pa(pgd) >> PAGE_SHIFT);
spin_lock_irqsave(&pgd_lock, flags);
pgd_list_del(pgd);
spin_unlock_irqrestore(&pgd_lock, flags);
@@ -274,13 +281,18 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
pmd_t *pmd = kmem_cache_alloc(pmd_cache, GFP_KERNEL);
if (!pmd)

[PATCH 5/6] VMI backend for paravirt-ops

2006-12-14 Thread Zachary Amsden
Fairly straightforward implementation of VMI backend for paravirt-ops.

Subject: VMI backend for paravirt-ops
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]

diff -r d8711b11c1eb arch/i386/Kconfig
--- a/arch/i386/Kconfig Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/Kconfig Tue Dec 12 13:51:13 2006 -0800
@@ -192,6 +192,15 @@ config PARAVIRT
  under a hypervisor, improving performance significantly.
  However, when run without a hypervisor the kernel is
  theoretically slower.  If in doubt, say N.
+
+config VMI
+   bool "VMI Paravirt-ops support"
+   depends on PARAVIRT
+   default y
+   help
+ VMI provides a paravirtualized interface to multiple hypervisors
+ include VMware ESX server and Xen by connecting to a ROM module
+ provided by the hypervisor.
 
 config ACPI_SRAT
bool
diff -r d8711b11c1eb arch/i386/kernel/Makefile
--- a/arch/i386/kernel/Makefile Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/kernel/Makefile Tue Dec 12 13:51:13 2006 -0800
@@ -39,6 +39,8 @@ obj-$(CONFIG_EARLY_PRINTK)+= early_prin
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-$(CONFIG_HPET_TIMER)   += hpet.o
 obj-$(CONFIG_K8_NB)+= k8.o
+
+obj-$(CONFIG_VMI)  += vmi.o
 
 # Make sure this is linked after any other paravirt_ops structs: see head.S
 obj-$(CONFIG_PARAVIRT) += paravirt.o
diff -r d8711b11c1eb arch/i386/kernel/head.S
--- a/arch/i386/kernel/head.S   Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/kernel/head.S   Tue Dec 12 13:51:13 2006 -0800
@@ -360,7 +360,7 @@ 1:  movb $1,X86_HARD_MATH
  * cpu_gdt_table and boot_pda; for secondary CPUs, these will be
  * that CPU's GDT and PDA.
  */
-setup_pda:
+ENTRY(setup_pda)
/* get the PDA pointer */
movl start_pda, %eax
 
diff -r d8711b11c1eb arch/i386/kernel/io_apic.c
--- a/arch/i386/kernel/io_apic.c   Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/kernel/io_apic.c   Tue Dec 12 13:51:13 2006 -0800
@@ -1914,7 +1914,7 @@ static void __init setup_ioapic_ids_from
 static void __init setup_ioapic_ids_from_mpc(void) { }
 #endif
 
-static int no_timer_check __initdata;
+int no_timer_check __initdata;
 
 static int __init notimercheck(char *s)
 {
diff -r d8711b11c1eb arch/i386/kernel/setup.c
--- a/arch/i386/kernel/setup.c  Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/kernel/setup.c  Tue Dec 12 13:51:13 2006 -0800
@@ -60,6 +60,7 @@
 #include <asm/io_apic.h>
 #include <asm/ist.h>
 #include <asm/io.h>
+#include <asm/vmi.h>
 #include <setup_arch.h>
 #include <bios_ebda.h>
 
@@ -581,6 +582,14 @@ void __init setup_arch(char **cmdline_p)
 
max_low_pfn = setup_memory();
 
+#ifdef CONFIG_VMI
+   /*
+* Must be after max_low_pfn is determined, and before kernel
+* pagetables are setup.
+*/
+   vmi_init();
+#endif
+
/*
 * NOTE: before this point _nobody_ is allowed to allocate
 * any memory using the bootmem allocator.  Although the
diff -r d8711b11c1eb arch/i386/kernel/smpboot.c
--- a/arch/i386/kernel/smpboot.c   Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/kernel/smpboot.c   Tue Dec 12 13:51:13 2006 -0800
@@ -63,6 +63,7 @@
 #include <mach_apic.h>
 #include <mach_wakecpu.h>
 #include <smpboot_hooks.h>
+#include <asm/vmi.h>
 
 /* Set if we find a B stepping CPU */
 static int __devinitdata smp_b_stepping;
@@ -547,6 +548,9 @@ static void __devinit start_secondary(vo
 * booting is too fragile that we want to limit the
 * things done here to the most necessary things.
 */
+#ifdef CONFIG_VMI
+   vmi_bringup();
+#endif
secondary_cpu_init();
preempt_disable();
smp_callin();
diff -r d8711b11c1eb arch/i386/mm/pgtable.c
--- a/arch/i386/mm/pgtable.c  Tue Dec 12 13:51:06 2006 -0800
+++ b/arch/i386/mm/pgtable.c  Tue Dec 12 13:51:13 2006 -0800
@@ -171,6 +171,8 @@ void reserve_top_address(unsigned long r
 void reserve_top_address(unsigned long reserve)
 {
BUG_ON(fixmaps > 0);
+   printk(KERN_INFO "Reserving virtual address space above 0x%08x\n",
+  (int)-reserve);
 #ifdef CONFIG_COMPAT_VDSO
BUG_ON(reserve != 0);
 #else
diff -r d8711b11c1eb include/asm-i386/timer.h
--- a/include/asm-i386/timer.h  Tue Dec 12 13:51:06 2006 -0800
+++ b/include/asm-i386/timer.h  Tue Dec 12 13:51:13 2006 -0800
@@ -8,6 +8,7 @@ void setup_pit_timer(void);
 /* Modifiers for buggy PIT handling */
 extern int pit_latch_buggy;
 extern int timer_ack;
+extern int no_timer_check;
 extern int recalibrate_cpu_khz(void);
 
 #endif
diff -r d8711b11c1eb arch/i386/kernel/vmi.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +
+++ b/arch/i386/kernel/vmi.c  Tue Dec 12 13:51:13 2006 -0800
@@ -0,0 +1,901 @@
+/*
+ * VMI specific paravirt-ops implementation
+ *
+ * Copyright (C) 2005, VMware, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; 

[PATCH 2/6] Paravirt CPU hypercall batching mode

2006-12-14 Thread Zachary Amsden
The VMI ROM has a mode where hypercalls can be queued and batched.  This turns
out to be a significant win during context switch, but must be done at a
specific point before side effects to CPU state are visible to subsequent
instructions.  This is similar to the MMU batching hooks already provided.
The same hooks could be used by the Xen backend to implement a context switch
multicall.

To explain a bit more about lazy modes in the paravirt patches, basically, the
idea is that only one of lazy CPU or MMU mode can be active at any given time.
Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of
multiple PTE updates (say, inside a remap loop), but to avoid keeping some kind
of state machine about when to flush cpu or mmu updates, we just allow one or
the other to be active.  Although there is no real reason a more comprehensive
scheme could not be implemented, there is also no demonstrated need for this
extra complexity.
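The lazy-mode convention described above can be pictured with a small userspace sketch. Everything in it is hypothetical (`struct batch`, `batch_enter`, `batch_leave`, `batch_call`, `BATCH_MAX`); it is not the paravirt_ops interface from this patch, only an illustration of "at most one lazy mode active, entry never nested, leaving flushes whatever was queued":

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the paravirt_ops interface. */
enum lazy_mode { LAZY_NONE, LAZY_MMU, LAZY_CPU };

#define BATCH_MAX 16

struct batch {
	enum lazy_mode mode;    /* at most one lazy mode is active */
	int queued[BATCH_MAX];  /* pending operations, in issue order */
	size_t nqueued;
	size_t flushes;         /* how many times the backend was entered */
	size_t issued;          /* total operations delivered */
};

/* Deliver everything queued so far in a single backend call. */
static void batch_flush(struct batch *b)
{
	if (b->nqueued == 0)
		return;
	b->issued += b->nqueued;
	b->nqueued = 0;
	b->flushes++;
}

/* Entry is never nested and never overlaps the other lazy mode. */
static void batch_enter(struct batch *b, enum lazy_mode mode)
{
	assert(b->mode == LAZY_NONE && mode != LAZY_NONE);
	b->mode = mode;
}

/* Leaving flushes, so later code sees the side effects. */
static void batch_leave(struct batch *b)
{
	assert(b->mode != LAZY_NONE);
	batch_flush(b);
	b->mode = LAZY_NONE;
}

/* Queue while lazy, otherwise issue immediately. */
static void batch_call(struct batch *b, int op)
{
	if (b->mode == LAZY_NONE) {
		b->issued++;
		b->flushes++;
		return;
	}
	if (b->nqueued == BATCH_MAX)
		batch_flush(b);
	b->queued[b->nqueued++] = op;
}
```

With one mode flag there is no state machine to maintain: three calls queued in lazy CPU mode reach the backend as one flush when the mode is left, and calls made outside any lazy mode go through one at a time.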

Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
Subject: Paravirt CPU hypercall batching mode

diff -r 01f2e46c1416 arch/i386/kernel/paravirt.c
--- a/arch/i386/kernel/paravirt.c   Thu Dec 14 14:26:24 2006 -0800
+++ b/arch/i386/kernel/paravirt.c   Thu Dec 14 14:44:56 2006 -0800
@@ -545,6 +545,7 @@ struct paravirt_ops paravirt_ops = {
.apic_write_atomic = native_apic_write_atomic,
.apic_read = native_apic_read,
 #endif
+   .set_lazy_mode = (void *)native_nop,
 
.flush_tlb_user = native_flush_tlb,
.flush_tlb_kernel = native_flush_tlb_global,
diff -r 01f2e46c1416 arch/i386/kernel/process.c
--- a/arch/i386/kernel/process.c  Thu Dec 14 14:26:24 2006 -0800
+++ b/arch/i386/kernel/process.c  Thu Dec 14 14:50:22 2006 -0800
@@ -665,6 +665,31 @@ struct task_struct fastcall * __switch_t
load_TLS(next, cpu);
 
/*
+* Now maybe handle debug registers and/or IO bitmaps
+*/
+   if (unlikely((task_thread_info(next_p)->flags & _TIF_WORK_CTXSW)
+   || test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)))
+   __switch_to_xtra(next_p, tss);
+
+   disable_tsc(prev_p, next_p);
+
+   /*
+* Leave lazy mode, flushing any hypercalls made here.
+* This must be done before restoring TLS segments so
+* the GDT and LDT are properly updated, and must be
+* done before math_state_restore, so the TS bit is up
+* to date.
+*/
+   arch_leave_lazy_cpu_mode();
+
+   /* If the task has used fpu the last 5 timeslices, just do a full
+* restore of the math state immediately to avoid the trap; the
+* chances of needing FPU soon are obviously high now
+*/
+   if (next_p->fpu_counter > 5)
+   math_state_restore();
+
+   /*
 * Restore %fs if needed.
 *
 * Glibc normally makes %fs be zero.
@@ -673,22 +698,6 @@ struct task_struct fastcall * __switch_t
loadsegment(fs, next->fs);
 
write_pda(pcurrent, next_p);
-
-   /*
-* Now maybe handle debug registers and/or IO bitmaps
-*/
-   if (unlikely((task_thread_info(next_p)->flags & _TIF_WORK_CTXSW)
-   || test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)))
-   __switch_to_xtra(next_p, tss);
-
-   disable_tsc(prev_p, next_p);
-
-   /* If the task has used fpu the last 5 timeslices, just do a full
-* restore of the math state immediately to avoid the trap; the
-* chances of needing FPU soon are obviously high now
-*/
-   if (next_p->fpu_counter > 5)
-   math_state_restore();
 
return prev_p;
 }
diff -r 01f2e46c1416 include/asm-generic/pgtable.h
--- a/include/asm-generic/pgtable.h Thu Dec 14 14:26:24 2006 -0800
+++ b/include/asm-generic/pgtable.h Thu Dec 14 14:44:56 2006 -0800
@@ -183,6 +183,19 @@ static inline void ptep_set_wrprotect(st
 #endif
 
 /*
+ * A facility to provide batching of the reload of page tables with the
+ * actual context switch code for paravirtualized guests.  By convention,
+ * only one of the lazy modes (CPU, MMU) should be active at any given
+ * time, entry should never be nested, and entry and exits should always
+ * be paired.  This is for sanity of maintaining and reasoning about the
+ * kernel code.
+ */
+#ifndef __HAVE_ARCH_ENTER_LAZY_CPU_MODE
+#define arch_enter_lazy_cpu_mode() do {} while (0)
+#define arch_leave_lazy_cpu_mode() do {} while (0)
+#endif
+
+/*
  * When walking page tables, get the address of the next boundary,
  * or the end address of the range if that comes earlier.  Although no
  * vma end wraps to 0, rounded up __boundary may wrap to 0 throughout.
diff -r 01f2e46c1416 include/asm-i386/paravirt.h
--- a/include/asm-i386/paravirt.h   Thu Dec 14 14:26:24 2006 -0800
+++ b/include/asm-i386/paravirt.h   Thu Dec 14 14:44:56 2006 -0800
@@ -146,6 +146,8 @@ struct paravirt_ops
void (fastcall *pmd_clear)(pmd_t *pmdp);
 #endif
 
+   void (fastcall 

[PATCH 3/6] IOPL handling for paravirt guests

2006-12-14 Thread Zachary Amsden
I found a clever way to make the extra IOPL switching invisible to
non-paravirt compiles: kernel_rpl is statically defined to be zero
there, and only kernels running at a non-zero RPL have a problem
restoring IOPL, as popf does not restore the IOPL flags unless run at CPL 0.
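A rough userspace sketch of why the check costs nothing on native builds follows. All names here are made up for illustration (`KERNEL_RPL_EXAMPLE`, `struct thread_example`, `set_iopl_example`, `switch_iopl`); the real patch uses get_kernel_rpl() and set_iopl_mask(). The assumption is simply that the RPL is a compile-time constant of zero when paravirt is not configured, so the compiler can discard the whole branch:

```c
#include <assert.h>

/* Assumed: a native build sees a constant 0; a paravirt build would not. */
#ifndef KERNEL_RPL_EXAMPLE
#define KERNEL_RPL_EXAMPLE 0
#endif

/* Hypothetical stand-in for the per-thread state; iopl holds EFLAGS bits 12-13. */
struct thread_example {
	unsigned long iopl;
};

static unsigned long current_iopl;
static int iopl_writes;

/* Stand-in for the hook that would reprogram IOPL via the hypervisor. */
static void set_iopl_example(unsigned long mask)
{
	current_iopl = mask;
	iopl_writes++;
}

/*
 * With KERNEL_RPL_EXAMPLE == 0 the condition is constant-false, so the
 * comparison and the call can be dropped at compile time.
 */
static void switch_iopl(const struct thread_example *prev,
			const struct thread_example *next)
{
	if (KERNEL_RPL_EXAMPLE && prev->iopl != next->iopl)
		set_iopl_example(next->iopl);
}
```

Building the same file with `-DKERNEL_RPL_EXAMPLE=1` makes the branch live, which roughly corresponds to a kernel running in ring 1 under a hypervisor.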

Subject: IOPL handling for paravirt guests
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]

diff -r 8110943fd7ad arch/i386/kernel/process.c
--- a/arch/i386/kernel/process.c  Thu Dec 14 16:15:20 2006 -0800
+++ b/arch/i386/kernel/process.c  Thu Dec 14 16:21:57 2006 -0800
@@ -665,6 +665,15 @@ struct task_struct fastcall * __switch_t
load_TLS(next, cpu);
 
/*
+* Restore IOPL if needed.  In normal use, the flags restore
+* in the switch assembly will handle this.  But if the kernel
+* is running virtualized at a non-zero CPL, the popf will
+* not restore flags, so it must be done in a separate step.
+*/
+   if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
+   set_iopl_mask(next-iopl);
+
+   /*
 * Now maybe handle debug registers and/or IO bitmaps
 */
if (unlikely((task_thread_info(next_p)->flags & _TIF_WORK_CTXSW)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kref refcnt and false positives

2006-12-14 Thread Eric Dumazet

Andrew Morton wrote:

On Wed, 13 Dec 2006 16:12:46 -0800
Greg KH [EMAIL PROTECTED] wrote:


Original comment seemed to indicate that this conditional thing was
performance related. Is it really? If not, we should consider the below patch.

Yes, it's a performance gain and I don't see how this patch would change
the above warning.


I suspect it's a false optimisation.

int kref_put(struct kref *kref, void (*release)(struct kref *kref))
{
	WARN_ON(release == NULL);
	WARN_ON(release == (void (*)(struct kref *))kfree);

	/*
	 * if current count is one, we are the last user and can release object
	 * right now, avoiding an atomic operation on 'refcount'
	 */
	if ((atomic_read(&kref->refcount) == 1) ||
	    (atomic_dec_and_test(&kref->refcount))) {
		release(kref);
		return 1;
	}
	return 0;
}

The only time we avoid the atomic_dec_and_test() is when the object is
about to be freed.  ie: once in its entire lifetime.  And freeing the
object is part of an expensive (and rare) operation anyway.

otoh, we've gone and added a test-n-branch to the common case: those cases
where the object will not be freed.
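For comparison, here is a minimal userspace sketch of the put path without the up-front read, written with C11 atomics instead of the kernel's atomic_t. The names (`struct ref`, `ref_init`, `ref_get`, `ref_put`, `count_release`) are illustrative only and are not the kref API:

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-in for struct kref; names are illustrative only. */
struct ref {
	atomic_int refcount;
};

static int release_calls;

/* Example release callback: only counts how often it ran. */
static void count_release(struct ref *r)
{
	(void)r;
	release_calls++;
}

static void ref_init(struct ref *r)
{
	atomic_init(&r->refcount, 1);
}

static void ref_get(struct ref *r)
{
	atomic_fetch_add(&r->refcount, 1);
}

/*
 * Drop one reference with a single atomic decrement and no up-front read.
 * atomic_fetch_sub returns the value before the decrement, so 1 means
 * this caller held the last reference.
 */
static int ref_put(struct ref *r, void (*release)(struct ref *r))
{
	if (atomic_fetch_sub(&r->refcount, 1) == 1) {
		release(r);
		return 1;
	}
	return 0;
}
```

In this variant the common, non-final put is one atomic operation with no extra test, and the count is always zero by the time release() runs. With the read shortcut, the count is still one when release() is called.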



I agree this 'optimization' is not good (I was the guy who suggested it:
http://lkml.org/lkml/2006/1/30/4 )


After Eric Biederman's message (http://lkml.org/lkml/2006/1/30/292) I remember
adding some stat counters and telling Greg not to put the patch in, because
kref_put() was mostly called with refcount=1. But the patch made its way in
anyway. I *did* ask Greg to revert it, but I cannot find that mail archived
anywhere...


But I believe Venkatesh's problem comes from his release() function: it is
supposed to free the object.

If it does not, it should set the object up properly so that further uses are OK,

i.e. by doing this in release(kref):
atomic_set(&kref->refcount, 0);

Eric


RE: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread David Schwartz

 Someone also mentioned that we could just put a nice poem into the
 kernel module image in order to be able to enforce our copyright license
 in any court of law.

   Full bellies of fish
   Penguins sleep under the moon
   Dream of wings that fly

 thanks,

Whoever says that has no understanding of copyright law. Copyright law
*only* protects something when there are a large number of equally-good ways
to accomplish the same thing. If there is only one way to accomplish a
particular function, it cannot be protected by copyright.

The Lexmark v. Static Controls case made this pretty clear. Lexmark did
pretty much the same thing with their toner cartridges. You cannot copyright
a password to get the effect of a patent (ownership of every way to
accomplish a particular function).

By the way, the GPL seems to prohibit this. Why is this not an additional
restriction? Where does the GPL say that you cannot create and use a
derivative work unless you put a notice in it stating that it is licensed
under the GPL?

I agree with Linus that this is insane hypocrisy. To be totally blunt, the
want to do this -- to control the way other people use the works they own --
is the same evil impulse that drives the RIAA. Shame on you. It's supposed
to be about free as in freedom.

DS




Re: [patch] Add allowed_affinity to the irq_desc to make it possible to have restricted irqs

2006-12-14 Thread Arjan van de Ven

Eric W. Biederman wrote:

What is the problem you are trying to solve?


2 problems
1) irq's that irqbalance should not touch at all
2) irqs that can only go to a subset of processors.

1) is very real today
2) is partially real on some of the bigger numa stuff already.


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread James Morris
On Wed, 13 Dec 2006, Martin J. Bligh wrote:

 The point of banning binary drivers would be to leverage hardware
 companies into either releasing open source drivers, or the specs for
 someone else to write them.

IMHO, it's up to the users to decide if they want to keep buying hardware 
which leads to inferior support, less reliability, decreased security and 
all of the other ills associated with binary drivers.  Let them also 
choose distributions which enact the binary driver policies they agree 
with.

Linux is precisely not about forcing people to do things.


- James
-- 
James Morris
[EMAIL PROTECTED]


Re: [stable] [PATCH 46/61] fix Intel RNG detection

2006-12-14 Thread Jan Beulich
with the patch it boots perfectly without any command-line args.

Are you getting the 'Firmware space is locked read-only' message then?

Thanks, Jan


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread David Woodhouse
On Wed, 2006-12-13 at 16:55 -0800, Greg KH wrote:
 Oh, and for those who have asked me how we would enforce this after this
 date if this decision is made, I'd like to go on record that I will be
 glad to take whatever legal means necessary to stop people from
 violating this. 

I see no _overriding_ reason to wait. This is a technical measure which
they'd need to deliberately work around, and which might make the case
easier to win -- but I think I'm on record already as planning to sue
someone soon for binary-only modules, even without this particular
technical measure to prevent them.

The only reason it hasn't happened so far is because lawyers make me
itch.

-- 
dwmw2



Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread David Woodhouse
On Wed, 2006-12-13 at 20:15 -0800, Linus Torvalds wrote:
 If a module arguably isn't a derived work, we simply shouldn't try to say 
 that its authors have to conform to our worldview.

I wouldn't argue that _anyone_ else should be exposed to my worldview; I
think the Geneva Convention has something to say about cruel and unusual
punishments.

But I would ask that they honour the licence on the code I release, and
perhaps more importantly on the code I import from other GPL sources.

If they fail to do that under the 'honour system' then I'm not averse to
'enforcing' it by technical measures. (For some value of 'enforcement'
which is easy for them to patch out if their lawyers are _really_ sure
they'll win when I sue them, that is.)

That's the big difference I see between this and the RIAA case you
mention -- in the case of Linux refusing to load non-GPL modules, if the
user _really_ thinks they'll win in court they can just hack it to load
the offending modules again. We are giving a _very_ strong indication of
our intent, but we aren't actually _forcing_ it on them in quite the
same way. With DRM-crippled players and hardware it's not so easy to get
around.

I'm very much in favour of Greg's approach. Give 12 months warning and
then just prevent loading of non-GPL modules.

That way, we get back from the current "binary modules are the status
quo even though some people are currently psyching themselves up to sue
for it" to "binary modules are possible if you're _very_ sure of your
legal position and willing to defend it". I think that's a very good
thing to do.

 We should make decisions on TECHNICAL MERIT. And this one is clearly being 
 pushed on anything but.

Not on my part. The thing that makes me _particularly_ vehement about
binary-only crap this week is very much a technical issue -- in
particular, the fact that we had to do post-production board
modifications to shoot our wireless chip in the head when it goes AWOL,
because the code for it wasn't available to us.

It's come back time and time again -- closed code is undebuggable,
unportable, unimprovable, unworkable. It's a detriment to the whole
system. That's very much a _technical_ issue, to me.

For non-kernel code I'm happy enough to release what I write under a BSD
licence. I'll default to GPL but usually respond favourably to requests
to do otherwise. It _isn't_ a religious issue.

 Same goes for code. Copyright is about _distribution_, not about use.
 We shouldn't limit how people use the code.

And we don't need to. Aside from the fact that they can patch out the
check if they have a genuine need to, they can also mark their module as
GPL without consequences as long as they don't _distribute_ it. We still
don't limit their _use_ of it.

 Oh, well. I realize nobody is likely going to listen to me, and everybody 
 has their opinion set in stone. 

My opinion is fairly much set from all the times I've come up against
_technical_ issues, I'll admit. But I did listen, and I agree with what
you say about the RIAA 'enforcement'. But I do see that as _very_
different to our 'enforcement', because ours is so easy to patch out
it's more of a 'hint' than a lockdown.

 That said, I'm going to suggest that you people talk to your COMPANY 
 LAWYERS on this, and I'm personally not going to merge that particular 
 code unless you can convince the people you work for to merge it first.

We've already merged EXPORT_SYMBOL_GPL. Is there a difference other than
one of extent? What about just marking kmalloc as EXPORT_SYMBOL_GPL for
a start? :)

 In other words, you guys know my stance. I'll not fight the combined 
 opinion of other kernel developers, but I sure as hell won't be the first 
 to merge this, and I sure as hell won't have _my_ tree be the one that 
 causes this to happen.
 
 So go get it merged in the Ubuntu, (Open)SuSE and RHEL and Fedora trees 
 first. This is not something where we use my tree as a way to get it to 
 other trees. This is something where the push had better come from the 
 other direction.

It's better to have a coherent approach, and for all of us to do it on
roughly the same timescale. Getting the distributions to do this is
going to be like herding cats -- having it upstream and letting it
trickle down is a much better approach, I think.

-- 
dwmw2



Re: [patch] Add allowed_affinity to the irq_desc to make it possible to have restricted irqs

2006-12-14 Thread Eric W. Biederman
Arjan van de Ven [EMAIL PROTECTED] writes:

 Eric W. Biederman wrote:
 What is the problem you are trying to solve?

 2 problems
 1) irq's that irqbalance should not touch at all

This is easy; we just need a single bit, not 128+ bytes on the huge
machines.
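To put numbers on that, here is a small sketch comparing a full per-IRQ CPU mask with a single status bit. The 1024-CPU figure and every name in it (`NR_CPUS_EXAMPLE`, `struct cpu_mask_example`, `IRQ_EXAMPLE_NO_BALANCE`, `irq_may_balance`) are assumptions made for illustration, not existing irq_desc fields or flags:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS_EXAMPLE 1024  /* assumed CPU count for a large machine */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((NR_CPUS_EXAMPLE + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* A full per-IRQ CPU mask: one bit per possible CPU. */
struct cpu_mask_example {
	unsigned long bits[MASK_LONGS];
};

/* Hypothetical flag; not an existing irq_desc status bit. */
#define IRQ_EXAMPLE_NO_BALANCE (1u << 0)

struct irq_example {
	unsigned int status;
};

/* "Do not touch this irq" needs only a flag test, not a mask. */
static int irq_may_balance(const struct irq_example *irq)
{
	return !(irq->status & IRQ_EXAMPLE_NO_BALANCE);
}
```

With 1024 possible CPUs the mask is 1024 / 8 = 128 bytes per interrupt, whereas the "hands off" case fits in a bit of a status word that would typically exist anyway.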

 2) irqs that can only go to a subset of processors.

 1) is very real today
 2) is partially real on some of the bigger numa stuff already.

You have said the NUMA case is handled in another way already?
In which case irqs that can only go to a subset of processors
shouldn't be a problem.

Eric


Re: [patch] Add allowed_affinity to the irq_desc to make it possible to have restricted irqs

2006-12-14 Thread Arjan van de Ven

Eric W. Biederman wrote:

1) is very real today
2) is partially real on some of the bigger numa stuff already.


You have said the NUMA case is handled in another way already?


the numa case of "I prefer that cpu" is handled. Not the "I cannot
work on those" case.



Re: Interphase Tachyon drivers missing.

2006-12-14 Thread Martin K. Petersen
 Eike == Rolf Eike Beer [EMAIL PROTECTED] writes:

Eike On Wednesday, 13 December 2006 17:51,
Eike [EMAIL PROTECTED] wrote:
 I'm not sure about the driver being cpqfc, I know in 2.6.0  1 the
 driver was definitely iphase.c/h/o I do know the chipset was used
 by almost everyone, Compaq/HP/DEC and Interphase's namebrand cards.
 
 I also know that the driver is still working in 2.4.33 my slackware
 11 default kernel picked up the card, which surprised me to say the
 least...  I won't have time to spend a weekend on it until about
 christmas. {or probably christmas day is more likely} Even then I
 can't make any kind of promise that I can do anything useful about
 it...

Eike Ok, then we're likely talking about different things. Maybe just
Eike another driver for that chipset. If I ever find some time
Eike I'll have a look at this one too.

The ip5526 driver was removed way back due to lack of interest.  It
only drove a limited set of cards from one vendor.

The interphase cards used a real Tachyon (HPFC-5000) chip.  The
controllers we usually discuss in the context of cpqfc have TachLite
(HPFC-51xx and later).

Tachyon is a really old chip and it's not completely compatible with
TachLite from a programming perspective.  It also doesn't have
contemporary features like - cough - PCI-support.  The
GSC/EISA/PCI/whatever glue chip was vendor-specific.

I'm sure the ip5526 driver could be revived -- it's not very big.  But
I doubt there are many cards out there that haven't been scrapped.

-- 
Martin K. Petersen  http://mkp.net/


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Igmar Palsenberg

  I'll put a .config and a dmesg of the machine booting at 
  http://www.jdi-ict.nl/plain/ for those who want to look at it.
 
 dmesg : http://www.jdi-ict.nl/plain/lnx01.dmesg
 Kernel config : http://www.jdi-ict.nl/plain/lnx01.config

Hmm.. Switching CONFIG_HZ from 1000 to 250 seems to 'fix' the problem. 
I haven't seen the issue in nearly a week now. This makes Andrew's theory 
about missing interrupts very likely.

Andrew / others : Is there a way to find out if it *is* missing 
interrupts ?


Regards,


Igmar


[RFC][PATCH] Make entries in the Device drivers menu individually selectable

2006-12-14 Thread Robert P. J. Day

  i've posted on this before so here's a slightly-updated patch that
uses the kbuild menuconfig feature to make numerous entries under
the Device drivers menu selectable on the spot.  if folks think this
is a good idea, what's the best way to get it in?

  i could officially submit the patch as is or, if that's too
wide-sweeping since it hits a lot of subsystems, leave it up to the
individual subsystem maintainers to decide for themselves and submit
their own patch.

  (the patch below modifies those entries for which a menuconfig
entry was immediately obvious and shouldn't affect any of the
underlying logic.  that's why some entries were deliberately left out
of the patch, at least for now.)


 drivers/ata/Kconfig |8 ++--
 drivers/connector/Kconfig   |8 
 drivers/dma/Kconfig |   10 +-
 drivers/edac/Kconfig|8 
 drivers/hwmon/Kconfig   |8 
 drivers/i2c/Kconfig |9 -
 drivers/ide/Kconfig |6 +-
 drivers/ieee1394/Kconfig|7 ---
 drivers/infiniband/Kconfig  |   10 +-
 drivers/isdn/Kconfig|9 -
 drivers/leds/Kconfig|9 +++--
 drivers/md/Kconfig  |8 
 drivers/message/i2o/Kconfig |   12 +---
 drivers/mmc/Kconfig |8 
 drivers/mtd/Kconfig |8 
 drivers/parport/Kconfig |8 
 drivers/pnp/Kconfig |8 
 drivers/spi/Kconfig |8 
 drivers/telephony/Kconfig   |9 -
 drivers/w1/Kconfig  |8 
 20 files changed, 77 insertions(+), 92 deletions(-)

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index 984ab28..a3bdf04 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -2,10 +2,8 @@
 # SATA/PATA driver configuration
 #

-menu "Serial ATA (prod) and Parallel ATA (experimental) drivers"
-
-config ATA
-   tristate "ATA device support"
+menuconfig ATA
+   tristate "Serial ATA (prod) and Parallel ATA (experimental) drivers"
depends on BLOCK
depends on !(M32R || M68K) || BROKEN
depends on !SUN4 || BROKEN
@@ -519,5 +517,3 @@ config PATA_IXP4XX_CF
  If unsure, say N.

 endif
-endmenu
-
diff --git a/drivers/connector/Kconfig b/drivers/connector/Kconfig
index e0bdc0d..9a5a061 100644
--- a/drivers/connector/Kconfig
+++ b/drivers/connector/Kconfig
@@ -1,6 +1,4 @@
-menu "Connector - unified userspace <-> kernelspace linker"
-
-config CONNECTOR
+menuconfig CONNECTOR
tristate "Connector - unified userspace <-> kernelspace linker"
depends on NET
---help---
@@ -10,6 +8,8 @@ config CONNECTOR
  Connector support can also be built as a module.  If so, the module
  will be called cn.ko.

+if CONNECTOR
+
 config PROC_EVENTS
boolean "Report process events to userspace"
depends on CONNECTOR=y
@@ -18,4 +18,4 @@ config PROC_EVENTS
  Provide a connector that reports process events to userspace. Send
  events such as fork, exec, id change (uid, gid, suid, etc), and exit.

-endmenu
+endif
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 30d021d..b1fb8c0 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -2,14 +2,14 @@
 # DMA engine configuration
 #

-menu "DMA Engine support"
-
-config DMA_ENGINE
-   bool "Support for DMA engines"
+menuconfig DMA_ENGINE
+   bool "DMA engine support"
---help---
  DMA engines offload copy operations from the CPU to dedicated
  hardware, allowing the copies to happen asynchronously.

+if DMA_ENGINE
+
 comment "DMA Clients"

 config NET_DMA
@@ -31,4 +31,4 @@ config INTEL_IOATDMA
---help---
  Enable support for the Intel(R) I/OAT DMA engine.

-endmenu
+endif
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 4f08984..e52e9b0 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -6,10 +6,9 @@
 # $Id: Kconfig,v 1.4.2.7 2005/07/08 22:05:38 dsp_llnl Exp $
 #

-menu 'EDAC - error detection and reporting (RAS) (EXPERIMENTAL)'

-config EDAC
-   tristate "EDAC core system error reporting (EXPERIMENTAL)"
+menuconfig EDAC
+   tristate 'EDAC - error detection and reporting (RAS) (EXPERIMENTAL)'
depends on X86 && EXPERIMENTAL
help
  EDAC is designed to report errors in the core system.
@@ -29,6 +28,7 @@ config EDAC
  There is also a mailing list for the EDAC project, which can
  be found via the sourceforge page.

+if EDAC

 comment "Reporting subsystems"
depends on EDAC
@@ -110,4 +110,4 @@ config EDAC_POLL

 endchoice

-endmenu
+endif
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 891ef6d..7a8afaa 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -2,9 +2,7 @@
 # Hardware monitoring chip drivers configuration
 #

-menu "Hardware Monitoring support"
-
-config HWMON
+menuconfig HWMON
tristate "Hardware Monitoring support"
default y

Re: [patch] Add allowed_affinity to the irq_desc to make it possible to have restricted irqs

2006-12-14 Thread Eric W. Biederman
Arjan van de Ven [EMAIL PROTECTED] writes:

 Eric W. Biederman wrote:
 1) is very real today
 2) is partially real on some of the bigger numa stuff already.

 You have said the NUMA case is handled in another way already?

 the numa case of "I prefer that cpu" is handled. Not the "I cannot work on
 those" case.

How is the NUMA case of "I prefer that cpu" handled?

Eric


[take28 1/8] kevent: Description.

2006-12-14 Thread Evgeniy Polyakov

Description.


diff --git a/Documentation/kevent.txt b/Documentation/kevent.txt
new file mode 100644
index 000..2e03a3f
--- /dev/null
+++ b/Documentation/kevent.txt
@@ -0,0 +1,240 @@
+Description.
+
+int kevent_init(struct kevent_ring *ring, unsigned int ring_size, 
+   unsigned int flags);
+
+num - size of the ring buffer in events 
+ring - pointer to allocated ring buffer
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value: kevent control file descriptor or negative error value.
+
+ struct kevent_ring
+ {
+   unsigned int ring_kidx, ring_over;
+   struct ukevent event[0];
+ }
+
+ring_kidx - index in the ring buffer where kernel will put new events 
+   when kevent_wait() or kevent_get_events() is called 
+ring_over - number of overflows of ring_uidx happend from the start.
+   Overflow counter is used to prevent situation when two threads 
+   are going to free the same events, but one of them was scheduled 
+   away for too long, so ring indexes were wrapped, so when that 
+   thread will be awakened, it will free not those events, which 
+   it suppose to free.
+
+Example userspace code (ring_buffer.c) can be found on project's homepage.
+
+Each kevent syscall can be so called cancellation point in glibc, i.e. when 
+thread has been cancelled in kevent syscall, thread can be safely removed 
+and no events will be lost, since each syscall (kevent_wait() or 
+kevent_get_events()) will copy event into special ring buffer, accessible 
+from other threads or even processes (if shared memory is used).
+
+When kevent is removed (not dequeued when it is ready, but just removed), 
+even if it was ready, it is not copied into ring buffer, since if it is 
+removed, no one cares about it (otherwise user would wait until it becomes 
+ready and got it through usual way using kevent_get_events() or kevent_wait()) 
+and thus no need to copy it to the ring buffer.
+
+---
+
+
+int kevent_ctl(int fd, unsigned int cmd, unsigned int num, struct ukevent *arg);
+
+fd - is the file descriptor referring to the kevent queue to manipulate. 
+It is created by opening /dev/kevent char device, which is created with 
+dynamic minor number and major number assigned for misc devices. 
+
+cmd - is the requested operation. It can be one of the following:
+KEVENT_CTL_ADD - add event notification 
+KEVENT_CTL_REMOVE - remove event notification 
+KEVENT_CTL_MODIFY - modify existing notification 
+KEVENT_CTL_READY - mark existing events as ready, if number of events is zero,
+   it just wakes up parked in syscall thread
+
+num - number of struct ukevent in the array pointed to by arg 
+arg - array of struct ukevent
+
+Return value: 
+ number of events processed or negative error value.
+
+When called, kevent_ctl will carry out the operation specified in the 
+cmd parameter.
+---
+
+ int kevent_get_events(int ctl_fd, unsigned int min_nr, unsigned int max_nr, 
+   struct timespec timeout, struct ukevent *buf, unsigned flags);
+
+ctl_fd - file descriptor referring to the kevent queue 
+min_nr - minimum number of completed events that kevent_get_events will block 
+waiting for 
+max_nr - number of struct ukevent in buf 
+timeout - time to wait before returning less than min_nr 
+ events. If this is -1, then wait forever. 
+buf - pointer to an array of struct ukevent. 
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value:
+ number of events copied or negative error value.
+
+kevent_get_events will wait up to timeout for at least min_nr completed 
+events, copying completed struct ukevents to buf and deleting any 
+KEVENT_REQ_ONESHOT event requests. In nonblocking mode it returns as many 
+events as possible, but not more than max_nr. In blocking mode it waits until 
+the timeout expires or at least min_nr events are ready.
+
+This function copies each event into the ring buffer if it was initialized;
+if the ring buffer is full, the KEVENT_RET_COPY_FAILED flag is set in the
+ret_flags field.
+---
+
+ int kevent_wait(int ctl_fd, unsigned int num, unsigned int old_uidx, 
+   struct timespec timeout, unsigned int flags);
+
+ctl_fd - file descriptor referring to the kevent queue 
+num - number of processed kevents 
+old_uidx - the last index user is aware of
+timeout - time to wait until there is free space in kevent queue
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value:
+ number of events copied into ring buffer or negative error value.
+
+This syscall waits until either the timeout expires or at least one event 
+becomes ready. It also copies events into a special ring buffer. If the ring 
+buffer is full, it waits until there are ready events and then returns.
+If kevent is one-shot kevent it is 
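
The return-count rule described for kevent_get_events() above can be
modelled in a few lines of userspace C. This is only a sketch of the
documented behaviour, not code from the patch: events_to_return() and
KEEP_WAITING are hypothetical names, and the assumption that min_nr never
exceeds max_nr is mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel: the caller should keep sleeping. */
#define KEEP_WAITING (-1)

/*
 * Hypothetical helper, not part of the kevent patch: models how many
 * events a kevent_get_events()-style call would copy to the caller.
 */
static int events_to_return(unsigned int ready, unsigned int min_nr,
			    unsigned int max_nr, bool nonblocking,
			    bool timed_out)
{
	unsigned int n;

	assert(min_nr <= max_nr);	/* assumed: buf can hold min_nr events */

	/* Blocking callers sleep until min_nr events are ready or the timeout expires. */
	if (!nonblocking && !timed_out && ready < min_nr)
		return KEEP_WAITING;

	/* Otherwise copy whatever is ready, capped at the buffer size. */
	n = ready < max_nr ? ready : max_nr;
	return (int)n;
}
```

With min_nr = 4 and max_nr = 8, a blocking caller that sees three ready
events keeps waiting, gets those three once the timeout expires, and never
gets more than eight however many are queued.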

[take28 4/8] kevent: Socket notifications.

2006-12-14 Thread Evgeniy Polyakov

Socket notifications.

This patch includes socket send/recv/accept notifications.
With a trivial web server based on kevent and these features
instead of epoll, its performance increased noticeably.
More details about various benchmarks and the server itself 
(evserver_kevent.c) can be found on the project's homepage.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/inode.c b/fs/inode.c
index ada7643..2740617 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -21,6 +21,7 @@
 #include <linux/cdev.h>
 #include <linux/bootmem.h>
 #include <linux/inotify.h>
+#include <linux/kevent.h>
 #include <linux/mount.h>
 
 /*
@@ -164,12 +165,18 @@ static struct inode *alloc_inode(struct super_block *sb)
 		}
 		inode->i_private = 0;
 		inode->i_mapping = mapping;
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+		kevent_storage_init(inode, &inode->st);
+#endif
 	}
 	return inode;
 }
 
 void destroy_inode(struct inode *inode) 
 {
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+	kevent_storage_fini(&inode->st);
+#endif
 	BUG_ON(inode_has_buffers(inode));
 	security_inode_free(inode);
 	if (inode->i_sb->s_op->destroy_inode)
diff --git a/include/net/sock.h b/include/net/sock.h
index edd4d73..d48ded8 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -48,6 +48,7 @@
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>	/* struct sk_buff */
 #include <linux/security.h>
+#include <linux/kevent.h>
 
 #include <linux/filter.h>
 
@@ -450,6 +451,21 @@ static inline int sk_stream_memory_free(struct sock *sk)
 
 extern void sk_stream_rfree(struct sk_buff *skb);
 
+struct socket_alloc {
+	struct socket socket;
+	struct inode vfs_inode;
+};
+
+static inline struct socket *SOCKET_I(struct inode *inode)
+{
+	return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
+}
+
+static inline struct inode *SOCK_INODE(struct socket *socket)
+{
+	return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
+}
+
 static inline void sk_stream_set_owner_r(struct sk_buff *skb, struct sock *sk)
 {
 	skb->sk = sk;
@@ -477,6 +493,7 @@ static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 		sk->sk_backlog.tail = skb;
 	}
 	skb->next = NULL;
+	kevent_socket_notify(sk, KEVENT_SOCKET_RECV);
 }
 
 #define sk_wait_event(__sk, __timeo, __condition)  \
@@ -679,21 +696,6 @@ static inline struct kiocb *siocb_to_kiocb(struct sock_iocb *si)
 	return si->kiocb;
 }
 
-struct socket_alloc {
-	struct socket socket;
-	struct inode vfs_inode;
-};
-
-static inline struct socket *SOCKET_I(struct inode *inode)
-{
-	return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
-}
-
-static inline struct inode *SOCK_INODE(struct socket *socket)
-{
-	return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
-}
-
 extern void __sk_stream_mem_reclaim(struct sock *sk);
 extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7a093d0..69f4ad2 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -857,6 +857,7 @@ static inline int tcp_prequeue(struct sock *sk, struct sk_buff *skb)
 			tp->ucopy.memory = 0;
 		} else if (skb_queue_len(&tp->ucopy.prequeue) == 1) {
 			wake_up_interruptible(sk->sk_sleep);
+			kevent_socket_notify(sk, KEVENT_SOCKET_RECV|KEVENT_SOCKET_SEND);
 			if (!inet_csk_ack_scheduled(sk))
 				inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
 						          (3 * TCP_RTO_MIN) / 4,
diff --git a/kernel/kevent/kevent_socket.c b/kernel/kevent/kevent_socket.c
new file mode 100644
index 0000000..1798092
--- /dev/null
+++ b/kernel/kevent/kevent_socket.c
@@ -0,0 +1,144 @@
+/*
+ * kevent_socket.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
+#include <linux/file.h>

[take28 5/8] kevent: Timer notifications.

2006-12-14 Thread Evgeniy Polyakov

Timer notifications.

Timer notifications can be used for fine-grained per-process time 
management, since interval timers are inconvenient to use and 
limited in number.

This subsystem uses high-resolution timers.
id.raw[0] is used as number of seconds
id.raw[1] is used as number of nanoseconds
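
As an illustration of the id.raw[] convention above, a userspace caller
could split a millisecond interval into the two fields like this. It is a
sketch only: struct timer_id is a hypothetical stand-in for the id field
of struct ukevent, the 32-bit width of raw[] is an assumption, and
timer_id_from_msecs() is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the id field of struct ukevent; only the
 * raw[] layout described above is assumed.
 */
struct timer_id {
	uint32_t raw[2];	/* raw[0] = seconds, raw[1] = nanoseconds */
};

/* Split an interval given in milliseconds into seconds and nanoseconds. */
static struct timer_id timer_id_from_msecs(uint64_t msecs)
{
	struct timer_id id;

	assert(msecs / 1000 <= UINT32_MAX);	/* seconds must fit in raw[0] */
	id.raw[0] = (uint32_t)(msecs / 1000);
	id.raw[1] = (uint32_t)((msecs % 1000) * 1000000u);
	return id;
}
```

A 1500 ms period, for example, becomes raw[0] = 1 and raw[1] = 500000000.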

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/kernel/kevent/kevent_timer.c b/kernel/kevent/kevent_timer.c
new file mode 100644
index 0000000..c21a155
--- /dev/null
+++ b/kernel/kevent/kevent_timer.c
@@ -0,0 +1,114 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/hrtimer.h>
+#include <linux/jiffies.h>
+#include <linux/kevent.h>
+
+struct kevent_timer
+{
+	struct hrtimer		ktimer;
+	struct kevent_storage	ktimer_storage;
+	struct kevent		*ktimer_event;
+};
+
+static int kevent_timer_func(struct hrtimer *timer)
+{
+	struct kevent_timer *t = container_of(timer, struct kevent_timer, ktimer);
+	struct kevent *k = t->ktimer_event;
+
+	kevent_storage_ready(&t->ktimer_storage, NULL, KEVENT_MASK_ALL);
+	hrtimer_forward(timer, timer->base->softirq_time,
+			ktime_set(k->event.id.raw[0], k->event.id.raw[1]));
+	return HRTIMER_RESTART;
+}
+
+static struct lock_class_key kevent_timer_key;
+
+static int kevent_timer_enqueue(struct kevent *k)
+{
+	int err;
+	struct kevent_timer *t;
+
+	t = kmalloc(sizeof(struct kevent_timer), GFP_KERNEL);
+	if (!t)
+		return -ENOMEM;
+
+	hrtimer_init(&t->ktimer, CLOCK_MONOTONIC, HRTIMER_REL);
+	t->ktimer.expires = ktime_set(k->event.id.raw[0], k->event.id.raw[1]);
+	t->ktimer.function = kevent_timer_func;
+	t->ktimer_event = k;
+
+	err = kevent_storage_init(&t->ktimer, &t->ktimer_storage);
+	if (err)
+		goto err_out_free;
+	lockdep_set_class(&t->ktimer_storage.lock, &kevent_timer_key);
+
+	err = kevent_storage_enqueue(&t->ktimer_storage, k);
+	if (err)
+		goto err_out_st_fini;
+
+	hrtimer_start(&t->ktimer, t->ktimer.expires, HRTIMER_REL);
+
+	return 0;
+
+err_out_st_fini:
+	kevent_storage_fini(&t->ktimer_storage);
+err_out_free:
+	kfree(t);
+
+	return err;
+}
+
+static int kevent_timer_dequeue(struct kevent *k)
+{
+	struct kevent_storage *st = k->st;
+	struct kevent_timer *t = container_of(st, struct kevent_timer, ktimer_storage);
+
+	hrtimer_cancel(&t->ktimer);
+	kevent_storage_dequeue(st, k);
+	kfree(t);
+
+	return 0;
+}
+
+static int kevent_timer_callback(struct kevent *k)
+{
+	k->event.ret_data[0] = jiffies_to_msecs(jiffies);
+	return 1;
+}
+
+static int __init kevent_init_timer(void)
+{
+	struct kevent_callbacks tc = {
+		.callback = kevent_timer_callback,
+		.enqueue = kevent_timer_enqueue,
+		.dequeue = kevent_timer_dequeue,
+		.flags = 0,
+	};
+
+	return kevent_add_callbacks(&tc, KEVENT_TIMER);
+}
+module_init(kevent_init_timer);
+

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[take28 0/8] kevent: Generic event handling mechanism.

2006-12-14 Thread Evgeniy Polyakov

Generic event handling mechanism.

Kevent is a generic subsystem for handling event notifications.
It supports both level- and edge-triggered events. It is similar to
poll/epoll in some cases, but it is more scalable, it is faster, and
it can work with essentially any kind of event.

Events are passed into the kernel through a control syscall and can be read
back through a ring buffer or using the usual syscalls.
A kevent update (i.e. readiness switching) happens directly from the internals
of the appropriate state machine of the underlying subsystem (like
network, filesystem, timer or any other).
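
To make the ring buffer idea concrete, here is a toy userspace model of a
ring with a kernel-side index and a user-side index, in the spirit of the
"kernel and user indexes" design mentioned in the changelog below. It is a
sketch under my own assumptions, not the patch's implementation:
struct ring, ring_put(), ring_commit() and RING_SIZE are all hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4u	/* illustrative capacity, not a kevent constant */

/*
 * Hypothetical ring: the producer ("kernel") advances kidx, the
 * consumer ("user") advances uidx once it has processed entries.
 */
struct ring {
	unsigned int kidx;	/* next slot the producer writes */
	unsigned int uidx;	/* next slot the consumer reads */
	unsigned int used;	/* entries written but not yet committed */
	int slot[RING_SIZE];
};

/* Producer side: store one ready event, or report that the ring is full. */
static bool ring_put(struct ring *r, int ev)
{
	if (r->used == RING_SIZE)
		return false;	/* full: the event has to wait */
	r->slot[r->kidx] = ev;
	r->kidx = (r->kidx + 1) % RING_SIZE;
	r->used++;
	return true;
}

/* Consumer side: report that num entries have been processed. */
static void ring_commit(struct ring *r, unsigned int num)
{
	assert(num <= r->used);
	r->uidx = (r->uidx + num) % RING_SIZE;
	r->used -= num;
}
```

Once the ring is full the producer has to hold events back until the
consumer commits some entries, which is roughly the situation the
KEVENT_RET_COPY_FAILED flag in the documentation is about.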

Homepage:
http://tservice.net.ru/~s0mbre/old/?section=projects&item=kevent

Documentation page:
http://linux-net.osdl.org/index.php/Kevent

Consider for inclusion.

A new benchmark, which may turn out to be a hoax, can be found at 
http://tservice.net.ru/~s0mbre/blog/2006/11/30#2006_11_30
where kevent on amd64 with 1 GB of RAM handles more than 7200 events per 
second at a concurrency of 8000 requests, using the 'ab' benchmark and lighttpd.
Although I thought it should not be published due to possible errors,
I decided to send it for review.

With this release I start a 3-day resend timeout: every third day I 
will send either a new version (if something new was requested and agreed to 
be implemented) or a resend, with a countdown starting from three. 
When the countdown hits zero after three resends, I will consider that there 
is no interest in the subsystem and stop sending it.

Thanks for understanding and your time.

Changes from 'take27' patchset:
 * made kevent default yes in non embedded case.
 * added flags to callback structures - currently used to check if kevent
can be requested from kernelspace only (posix timers) or 
userspace (all others)

Changes from 'take26' patchset:
 * made kevent visible in config only in case of embedded setup.
 * added comment about KEVENT_MAX number.
 * spell fix.

Changes from 'take25' patchset:
 * use timespec as timeout parameter.
 * added high-resolution timer to handle absolute timeouts.
 * added flags to waiting and initialization syscalls.
 * kevent_commit() has new_uidx parameter.
 * kevent_wait() has old_uidx parameter, which, if not equal to u->uidx,
results in immediate wakeup (useful for the case when entries
are added asynchronously from kernel (not supported for now)).
 * added interface to mark any event as ready.
 * event POSIX timers support.
 * return -ENOSYS if there is no registered event type.
 * provided file descriptor must be checked for fifo type (spotted by Eric 
Dumazet).
 * signal notifications.
 * documentation update.
 * lighttpd patch updated (the latest benchmarks with lighttpd patch can be 
found in blog).

Changes from 'take24' patchset:
 * new (old (new)) ring buffer implementation with kernel and user indexes.
 * added initialization syscall instead of opening /dev/kevent
 * kevent_commit() syscall to commit ring buffer entries
 * changed KEVENT_REQ_WAKEUP_ONE flag to KEVENT_REQ_WAKEUP_ALL, kevent wakes
   only first thread always if that flag is not set
 * KEVENT_REQ_ALWAYS_QUEUE flag. If set, kevent will be queued into ready queue
   instead of copying back to userspace when kevent is ready immediately when
   it is added.
 * lighttpd patch (Hail! Although nothing really outstanding compared to epoll)

Changes from 'take23' patchset:
 * kevent PIPE notifications
 * KEVENT_REQ_LAST_CHECK flag, which allows to perform last check at dequeueing 
time
 * fixed poll/select notifications (were broken due to tree manipulations)
 * made Documentation/kevent.txt look nice in 80-col terminal
 * fix for copy_to_user() failure report for the first kevent (Andrew Morton)
 * minor function renames

Changes from 'take22' patchset:
 * new ring buffer implementation in process' memory
 * wakeup-one-thread flag
 * edge-triggered behaviour

Changes from 'take21' patchset:
 * minor cleanups (different return values, removed unneeded variables, 
whitespaces and so on)
 * fixed bug in kevent removal in case when kevent being removed
   is the same as overflow_kevent (spotted by Eric Dumazet)

Changes from 'take20' patchset:
 * new ring buffer implementation
 * removed artificial limit on possible number of kevents

Changes from 'take19' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take18' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take17' patchset:
 * Use RB tree instead of hash table. 
At least for a web server, frequency of addition/deletion of new kevent 
is comparable with number of search access, i.e. most of the time events 
are added, accessed only couple of times and then 

[take28 3/8] kevent: poll/select() notifications.

2006-12-14 Thread Evgeniy Polyakov

poll/select() notifications.

This patch includes generic poll/select notifications.
kevent_poll works similarly to epoll and has the same issues (the callback
is invoked not from the internal state machine of the caller but through
process wakeup, a lot of allocations, and so on).

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/file_table.c b/fs/file_table.c
index bc35a40..0805547 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -20,6 +20,7 @@
 #include <linux/cdev.h>
 #include <linux/fsnotify.h>
 #include <linux/sysctl.h>
+#include <linux/kevent.h>
 #include <linux/percpu_counter.h>
 
 #include <asm/atomic.h>
@@ -119,6 +120,7 @@ struct file *get_empty_filp(void)
 	f->f_uid = tsk->fsuid;
 	f->f_gid = tsk->fsgid;
 	eventpoll_init_file(f);
+	kevent_init_file(f);
 	/* f->f_version: 0 */
 	return f;
 
@@ -164,6 +166,7 @@ void fastcall __fput(struct file *file)
 	 * in the file cleanup chain.
 	 */
 	eventpoll_release(file);
+	kevent_cleanup_file(file);
 	locks_remove_flock(file);
 
 	if (file->f_op && file->f_op->release)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5baf3a1..8bbf3a5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -276,6 +276,7 @@ extern int dir_notify_enable;
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/mutex.h>
+#include <linux/kevent_storage.h>
 
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
@@ -586,6 +587,10 @@ struct inode {
 	struct mutex		inotify_mutex;	/* protects the watches list */
 #endif
 
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+	struct kevent_storage	st;
+#endif
+
 	unsigned long		i_state;
 	unsigned long		dirtied_when;	/* jiffies of first dirtying */
 
@@ -739,6 +744,9 @@ struct file {
 	struct list_head	f_ep_links;
 	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+	struct kevent_storage	st;
+#endif
 	struct address_space	*f_mapping;
 };
 extern spinlock_t files_lock;
diff --git a/kernel/kevent/kevent_poll.c b/kernel/kevent/kevent_poll.c
new file mode 100644
index 0000000..7ccf7da
--- /dev/null
+++ b/kernel/kevent/kevent_poll.c
@@ -0,0 +1,234 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
+#include <linux/file.h>
+#include <linux/kevent.h>
+#include <linux/poll.h>
+#include <linux/fs.h>
+
+static kmem_cache_t *kevent_poll_container_cache;
+static kmem_cache_t *kevent_poll_priv_cache;
+
+struct kevent_poll_ctl
+{
+	struct poll_table_struct	pt;
+	struct kevent			*k;
+};
+
+struct kevent_poll_wait_container
+{
+	struct list_head		container_entry;
+	wait_queue_head_t		*whead;
+	wait_queue_t			wait;
+	struct kevent			*k;
+};
+
+struct kevent_poll_private
+{
+	struct list_head		container_list;
+	spinlock_t			container_lock;
+};
+
+static int kevent_poll_enqueue(struct kevent *k);
+static int kevent_poll_dequeue(struct kevent *k);
+static int kevent_poll_callback(struct kevent *k);
+
+static int kevent_poll_wait_callback(wait_queue_t *wait,
+		unsigned mode, int sync, void *key)
+{
+	struct kevent_poll_wait_container *cont =
+		container_of(wait, struct kevent_poll_wait_container, wait);
+	struct kevent *k = cont->k;
+
+	kevent_storage_ready(k->st, NULL, KEVENT_MASK_ALL);
+	return 0;
+}
+
+static void kevent_poll_qproc(struct file *file, wait_queue_head_t *whead,
+		struct poll_table_struct *poll_table)
+{
+	struct kevent *k =
+		container_of(poll_table, struct kevent_poll_ctl, pt)->k;
+	struct kevent_poll_private *priv = k->priv;
+	struct kevent_poll_wait_container *cont;
+	unsigned long flags;
+
+	cont = kmem_cache_alloc(kevent_poll_container_cache, GFP_KERNEL);
+	if (!cont) {
+		kevent_break(k);
+		return;
+	}
+
+	cont->k = k;
+	init_waitqueue_func_entry(&cont->wait, kevent_poll_wait_callback);
+	cont->whead = whead;
+
+	spin_lock_irqsave(&priv->container_lock, flags);
+   

[take28 6/8] kevent: Pipe notifications.

2006-12-14 Thread Evgeniy Polyakov

Pipe notifications.


diff --git a/fs/pipe.c b/fs/pipe.c
index f3b6f71..aeaee9c 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -16,6 +16,7 @@
 #include <linux/uio.h>
 #include <linux/highmem.h>
 #include <linux/pagemap.h>
+#include <linux/kevent.h>
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -312,6 +313,7 @@ redo:
 			break;
 		}
 		if (do_wakeup) {
+			kevent_pipe_notify(inode, KEVENT_SOCKET_SEND);
 			wake_up_interruptible_sync(&pipe->wait);
 			kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 		}
@@ -321,6 +323,7 @@ redo:
 
 	/* Signal writers asynchronously that there is more room. */
 	if (do_wakeup) {
+		kevent_pipe_notify(inode, KEVENT_SOCKET_SEND);
 		wake_up_interruptible(&pipe->wait);
 		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 	}
@@ -490,6 +493,7 @@ redo2:
 			break;
 		}
 		if (do_wakeup) {
+			kevent_pipe_notify(inode, KEVENT_SOCKET_RECV);
 			wake_up_interruptible_sync(&pipe->wait);
 			kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 			do_wakeup = 0;
@@ -501,6 +505,7 @@ redo2:
 out:
 	mutex_unlock(&inode->i_mutex);
 	if (do_wakeup) {
+		kevent_pipe_notify(inode, KEVENT_SOCKET_RECV);
 		wake_up_interruptible(&pipe->wait);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 	}
@@ -605,6 +610,7 @@ pipe_release(struct inode *inode, int decr, int decw)
 		free_pipe_info(inode);
 	} else {
 		wake_up_interruptible(&pipe->wait);
+		kevent_pipe_notify(inode, KEVENT_SOCKET_SEND|KEVENT_SOCKET_RECV);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 	}
diff --git a/kernel/kevent/kevent_pipe.c b/kernel/kevent/kevent_pipe.c
new file mode 100644
index 0000000..91dc1eb
--- /dev/null
+++ b/kernel/kevent/kevent_pipe.c
@@ -0,0 +1,123 @@
+/*
+ * kevent_pipe.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/kevent.h>
+#include <linux/pipe_fs_i.h>
+
+static int kevent_pipe_callback(struct kevent *k)
+{
+	struct inode *inode = k->st->origin;
+	struct pipe_inode_info *pipe = inode->i_pipe;
+	int nrbufs = pipe->nrbufs;
+
+	if (k->event.event & KEVENT_SOCKET_RECV && nrbufs > 0) {
+		if (!pipe->writers)
+			return -1;
+		return 1;
+	}
+
+	if (k->event.event & KEVENT_SOCKET_SEND && nrbufs < PIPE_BUFFERS) {
+		if (!pipe->readers)
+			return -1;
+		return 1;
+	}
+
+	return 0;
+}
+
+int kevent_pipe_enqueue(struct kevent *k)
+{
+	struct file *pipe;
+	int err = -EBADF;
+	struct inode *inode;
+
+	pipe = fget(k->event.id.raw[0]);
+	if (!pipe)
+		goto err_out_exit;
+
+	inode = igrab(pipe->f_dentry->d_inode);
+	if (!inode)
+		goto err_out_fput;
+
+	err = -EINVAL;
+	if (!S_ISFIFO(inode->i_mode))
+		goto err_out_iput;
+
+	err = kevent_storage_enqueue(&inode->st, k);
+	if (err)
+		goto err_out_iput;
+
+	if (k->event.req_flags & KEVENT_REQ_ALWAYS_QUEUE) {
+		kevent_requeue(k);
+		err = 0;
+	} else {
+		err = k->callbacks.callback(k);
+		if (err)
+			goto err_out_dequeue;
+	}
+
+	fput(pipe);
+
+	return err;
+
+err_out_dequeue:
+	kevent_storage_dequeue(k->st, k);
+err_out_iput:
+	iput(inode);
+err_out_fput:
+	fput(pipe);
+err_out_exit:
+	return err;
+}
+
+int kevent_pipe_dequeue(struct kevent *k)
+{
+	struct inode *inode = k->st->origin;
+
+	kevent_storage_dequeue(k->st, k);
+	iput(inode);
+
+	return 0;
+}
+
+void kevent_pipe_notify(struct inode 

[take28 8/8] kevent: Kevent posix timer notifications.

2006-12-14 Thread Evgeniy Polyakov

Kevent posix timer notifications.

A simple extension to POSIX timers which allows
notification of timer expiration to be delivered
through the kevent queue.

An example application, posix_timer.c, can be found
in the archive on the project homepage.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h
index 8786e01..3768746 100644
--- a/include/asm-generic/siginfo.h
+++ b/include/asm-generic/siginfo.h
@@ -235,6 +235,7 @@ typedef struct siginfo {
 #define SIGEV_NONE 1   /* other notification: meaningless */
 #define SIGEV_THREAD   2   /* deliver via thread creation */
 #define SIGEV_THREAD_ID 4  /* deliver to thread */
+#define SIGEV_KEVENT   8   /* deliver through kevent queue */
 
 /*
  * This works because the alignment is ok on all current architectures
@@ -260,6 +261,8 @@ typedef struct sigevent {
 			void (*_function)(sigval_t);
 			void *_attribute;	/* really pthread_attr_t */
 		} _sigev_thread;
+
+		int kevent_fd;
 	} _sigev_un;
 } sigevent_t;
 
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index a7dd38f..4b9deb4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -4,6 +4,7 @@
 #include <linux/spinlock.h>
 #include <linux/list.h>
 #include <linux/sched.h>
+#include <linux/kevent_storage.h>
 
 union cpu_time_count {
 	cputime_t cpu;
@@ -49,6 +50,9 @@ struct k_itimer {
 	sigval_t it_sigev_value;	/* value word of sigevent struct */
 	struct task_struct *it_process;	/* process to send signal to */
 	struct sigqueue *sigq;		/* signal queue entry. */
+#ifdef CONFIG_KEVENT_TIMER
+	struct kevent_storage st;
+#endif
 	union {
 		struct {
 			struct hrtimer timer;
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index e5ebcc1..74270f8 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -48,6 +48,8 @@
 #include <linux/wait.h>
 #include <linux/workqueue.h>
 #include <linux/module.h>
+#include <linux/kevent.h>
+#include <linux/file.h>
 
 /*
  * Management arrays for POSIX timers.  Timers are kept in slab memory
@@ -224,6 +226,100 @@ static int posix_ktime_get_ts(clockid_t which_clock, struct timespec *tp)
 	return 0;
 }
 
+#ifdef CONFIG_KEVENT_TIMER
+static int posix_kevent_enqueue(struct kevent *k)
+{
+	/*
+	 * It is not ugly - there is no pointer in the id field union, 
+	 * but its size is 64bits, which is ok for any known pointer size.
+	 */
+	struct k_itimer *tmr = (struct k_itimer *)(unsigned long)k->event.id.raw_u64;
+	return kevent_storage_enqueue(&tmr->st, k);
+}
+static int posix_kevent_dequeue(struct kevent *k)
+{
+	struct k_itimer *tmr = (struct k_itimer *)(unsigned long)k->event.id.raw_u64;
+	kevent_storage_dequeue(&tmr->st, k);
+	return 0;
+}
+static int posix_kevent_callback(struct kevent *k)
+{
+	return 1;
+}
+static int posix_kevent_init(void)
+{
+	struct kevent_callbacks tc = {
+		.callback = posix_kevent_callback,
+		.enqueue = posix_kevent_enqueue,
+		.dequeue = posix_kevent_dequeue,
+		.flags = KEVENT_CALLBACKS_KERNELONLY};
+
+	return kevent_add_callbacks(&tc, KEVENT_POSIX_TIMER);
+}
+
+extern struct file_operations kevent_user_fops;
+
+static int posix_kevent_init_timer(struct k_itimer *tmr, int fd)
+{
+	struct ukevent uk;
+	struct file *file;
+	struct kevent_user *u;
+	int err;
+
+	file = fget(fd);
+	if (!file) {
+		err = -EBADF;
+		goto err_out;
+	}
+
+	if (file->f_op != &kevent_user_fops) {
+		err = -EINVAL;
+		goto err_out_fput;
+	}
+
+	u = file->private_data;
+
+	memset(&uk, 0, sizeof(struct ukevent));
+
+	uk.event = KEVENT_MASK_ALL;
+	uk.type = KEVENT_POSIX_TIMER;
+	uk.id.raw_u64 = (unsigned long)(tmr); /* Just cast to something unique */
+	uk.req_flags = KEVENT_REQ_ONESHOT | KEVENT_REQ_ALWAYS_QUEUE;
+	uk.ptr = tmr->it_sigev_value.sival_ptr;
+
+	err = kevent_user_add_ukevent(&uk, u);
+	if (err)
+		goto err_out_fput;
+
+	fput(file);
+
+	return 0;
+
+err_out_fput:
+	fput(file);
+err_out:
+	return err;
+}
+
+static void posix_kevent_fini_timer(struct k_itimer *tmr)
+{
+	kevent_storage_fini(&tmr->st);
+}
+#else
+static int posix_kevent_init_timer(struct k_itimer *tmr, int fd)
+{
+	return -ENOSYS;
+}
+static int posix_kevent_init(void)
+{
+	return 0;
+}
+static void posix_kevent_fini_timer(struct k_itimer *tmr)
+{
+}
+#endif
+
+
 /*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
@@ -241,6 +337,11 @@ static __init int init_posix_timers(void)
 	register_posix_clock(CLOCK_REALTIME, &clock_realtime);

Re: [stable] [PATCH 46/61] fix Intel RNG detection

2006-12-14 Thread dean gaudet
On Thu, 14 Dec 2006, Jan Beulich wrote:

 with the patch it boots perfectly without any command-line args.
 
 Are you getting the 'Firmware space is locked read-only' message then?

yep...

so let me ask a naive question... don't we want the firmware locked 
read-only because that protects the bios from viruses?  honestly i'm naive 
in this area of pc hardware, but i'm kind of confused why we'd want 
unlocked firmware just so we can detect a RNG.

-dean


Re: [patch] Add allowed_affinity to the irq_desc to make it possible to have restricted irqs

2006-12-14 Thread Arjan van de Ven

> > the numa case of "I prefer that cpu" is handled. Not the "I cannot work on
> > those".
> 
> How is the NUMA case of "I prefer that cpu" handled?

it's exported via /sys/bus/pci/devices/<device>/local_cpus
(and the irq is in the /irq directory next to local_cpus)
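
For reference, a small userspace sketch of how such a mask could be
tested for a given CPU. It assumes that local_cpus holds a hex cpumask
written as comma-separated 32-bit groups, most significant group first
(for example "00000003"); cpu_in_mask() is a hypothetical helper, not an
existing kernel or libc function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if cpu is set in a mask string such as "00000003" or
 * "00000001,00000000", 0 if it is not, and -1 on malformed input.
 */
static int cpu_in_mask(const char *mask, unsigned int cpu)
{
	unsigned int group = cpu / 32;	/* 0 = rightmost 32-bit group */
	const char *end;
	size_t len;

	assert(mask != NULL);
	len = strcspn(mask, "\n");	/* ignore a trailing newline */
	if (len == 0)
		return -1;
	end = mask + len;

	/* Walk the groups from right (least significant) to left. */
	while (end > mask) {
		const char *start = end;
		char buf[9];
		char *stop;
		unsigned long val;
		size_t n;

		while (start > mask && start[-1] != ',')
			start--;
		n = (size_t)(end - start);
		if (n == 0 || n > 8)
			return -1;
		if (group == 0) {
			memcpy(buf, start, n);
			buf[n] = '\0';
			val = strtoul(buf, &stop, 16);
			if (*stop != '\0')
				return -1;
			return (int)((val >> (cpu % 32)) & 1ul);
		}
		group--;
		end = (start > mask) ? start - 1 : mask;
	}
	return 0;	/* cpu lies beyond the groups present */
}
```

Reading the file itself and passing its contents to the helper is left
out to keep the sketch short.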



Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Andrew Morton
On Thu, 14 Dec 2006 09:15:39 +0100 (CET)
Igmar Palsenberg [EMAIL PROTECTED] wrote:

 
> > > I'll put a .config and a dmesg of the machine booting at
> > > http://www.jdi-ict.nl/plain/ for those who want to look at it.
> >
> > dmesg : http://www.jdi-ict.nl/plain/lnx01.dmesg
> > Kernel config : http://www.jdi-ict.nl/plain/lnx01.config
>
> Hmm.. Switching CONFIG_HZ from 1000 to 250 seems to 'fix' the problem.
> I haven't seen the issue in nearly a week now. This makes Andrew's theory
> about missing interrupts very likely.
>
> Andrew / others : Is there a way to find out if it *is* missing
> interrupts ?
 

umm, nasty.  What's in /proc/interrupts?
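
One rough way to look at /proc/interrupts for this is to take two
snapshots a few seconds apart and compare the totals per IRQ line. The
helper below is only a sketch: irq_line_total() is a hypothetical name,
and it assumes the usual layout of a label, a colon, one counter per CPU
and then the controller and device names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line, e.g.
 * " 14:   1200    34   IO-APIC-edge  ide0".  Returns -1 if the line
 * has no "label:" prefix (such as the CPU0/CPU1 header line).
 */
static long long irq_line_total(const char *line)
{
	long long total = 0;
	const char *p = line;
	char *end;

	assert(line != NULL);
	while (*p && *p != ':')
		p++;
	if (*p != ':')
		return -1;
	p++;

	for (;;) {
		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;	/* reached the controller/device names */
		total += strtoll(p, &end, 10);
		p = end;
	}
	return total;
}
```

A total that stops advancing while the device is known to be busy would
be consistent with interrupts being lost, though it does not prove it.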


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Greg KH
On Thu, Dec 14, 2006 at 12:10:15AM -0500, Bill Nottingham wrote:
 
> Greg KH ([EMAIL PROTECTED]) said:
> > An updated version is below.
>
> If you're adding this, you should probably schedule EXPORT_SYMBOL_GPL
> for removal at the same time, as this essentially renders that irrelevant.
>
> That being said...
>
> First, this is adding the measure at module load time. Any copyright
> infringement happens on distribution; module load isn't (necessarily)
> that; if I write random code and load it, without ever sending it
> to anyone, I'm not violating the license, and this would prevent that.
> So it seems somewhat misplaced.

Yes, as Linus points out, this is the main point here, my apologies.
GPL covers distribution, not usage, no matter how much the people
working on v3 want to change that :)

Even if we change the kernel this way, it prevents valid and legal
usages of the kernel.  So I am wrong, sorry.

> Secondly...
>
> > Oh, and for those who have asked me how we would enforce this after this
> > date if this decision is made, I'd like to go on record that I will be
> > glad to take whatever legal means necessary to stop people from
> > violating this.
>
> There's nothing stopping you undertaking these means now. Addition of
> this measure doesn't change the copyright status of any code - what was
> a violation before would still be a violation.

Agreed, and I have done this in the past.  I only stated this because it
seems that some people keep just wishing this whole issue would go away
if they ignore it.

 Hence, the only purpose of a clause like this legally would seem to be
 to *intentionally* go after people using the DMCA. Which seems... tacky.

Despite my wardrobe consisting mainly of old t-shirts and jeans, I still
never want to be called tacky :)

It's just that I'm so damn tired of this whole thing.  I'm tired of
people thinking they have a right to violate my copyright all the time.
I'm tired of people and companies somehow treating our license in ways
that are blatantly wrong and feeling fine about it.  Because we are a
loose band of a lot of individuals, and not a company or legal entity,
it seems to give companies the chutzpah to feel that they can get away
with violating our license.

So when someone like Andrew gives me the opportunity to put a stop to
all of the crap that I have to put up with each and every day with a
tiny 2 line patch, I jumped in and took it.  I need to sit back and
remember to see the bigger picture sometimes, so I apologize to
everyone here.

And yes, it is crap that I deal with every day due to the lovely grey
area that is Linux kernel module licensing these days.  I have customers
that demand we support them despite them mixing three and more different
closed source kernel modules at once and getting upset that I have no
way to help them out.  I have loony video tweakers that hand edit kernel
oopses to try to hide the fact that they are using a binary module
bigger than the sum of the whole kernel and demand that our group fix
their suspend/resume issue for them.  I see executives who say one thing
to the community and then turn around and overrule them just because
someone made a horrible purchasing decision on the brand of laptop wifi
card that they purchased.  I see lawyers who have their hands tied by
attorney-client rules and can not speak out in public for how they
really feel about licenses and how to interpret them.

And in the midst of all of that are the poor users who have no idea who
to listen to.  They don't know what is going on, they just want to use
their hardware and don't give a damn about anyone's license.  And then
there's the distros out there that listen to those users and give them
the working distro as they see a market for it, and again, as a company,
justify to themselves that it must be ok to violate those kernel
developers' rights because no one seems to be stopping them so far.

[side diversion, it's not the video drivers that really matter here
everyone, those are just so obvious.  It's the hundreds of other
blatantly infringing binary kernel modules out there that really matter.
The ones that control filesystems, cluster interconnects, disk arrays,
media codecs, and a whole host of custom hardware.  That's the real
problem that Linux faces now and will only get worse in the future.
It's not two stupid little video drivers, I could honestly care less
about them...]

But it's all part of the process, and I can live with it, even if at
times it drives me crazy.

But I know we will succeed, it will just take us a little longer to get
there, so I might as well learn to enjoy the view more.

Even though I really think I can get that patch by the Novell lawyers
and convince management there that it is something we can do, it's not
something that I want to take on, as I think my time can be better spent
coding to advance Linux technically, not fight legal battles.

I'll go delete that module.c patch from my tree now.

thanks,

greg k-h

p.s. I still think the 

Re: BUG: unable to handle kernel paging request in 2.6.19-git

2006-12-14 Thread Ben Castricum


On Tue, 12 Dec 2006, Randy Dunlap wrote:

 On Tue, 12 Dec 2006 07:48:51 +0100 (CET) Ben Castricum wrote:

 
  This bug started to show up after the release of 2.6.19 (iirc plain 2.6.19
  was still working fine).
 
  The full dmesg is at
  http://www.bencastricum.nl/lk/bootmessages-2.6.19-g9202f325.log,
  and the .config http://www.bencastricum.nl/lk/config-g9202f325.log
 
  I haven't tried disabling CONFIG_PCI_MULTITHREAD_PROBE. But if this
  might help in someway I'll give it a shot.

 Yes, it appears to be that config option.  Please disable it
 and retest and re-report.

As expected, disabling CONFIG_PCI_MULTITHREAD_PROBE causes the bug to
disappear.

Regards,
Ben



  Thanks,
  Ben
 
  e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
  e100: Copyright(c) 1999-2006 Intel Corporation
  BUG: unable to handle kernel paging request at virtual address d880a000
   printing eip:
  d880a000
  *pde = 01382067
  *pte = 
  Oops:  [#1]
  Modules linked in: e100 mii ext2 unix
  CPU:0
  EIP:0060:[d880a000]Not tainted VLI
  EFLAGS: 00010282   (2.6.19-g9202f325 #15)
  EIP is at 0xd880a000
  eax: c13c9000   ebx: d8876fe0   ecx: d8876470   edx: d8876470
  esi: d8876fe0   edi: ffed   ebp: d8877014   esp: d7a15f7c
  ds: 007b   es: 007b   ss: 0068
  Process probe-:00:0 (pid: 72, ti=d7a14000 task=d7828560
  task.ti=d7a14000)
  Stack: c01b009a c13c9000 c01b00ec d8876fe0 c13c9000  c01b0126
  c13c9048
 d7821560 c0205b27 d7821560 1fcc 6ab5e081 4ada d7acded0
  d7821560
 c0205aa0 fffc c0128186 0001   c01280d0
  
  Call Trace:
   [c01b009a] pci_call_probe+0xa/0x10
   [c01b00ec] __pci_device_probe+0x4c/0x60
   [c01b0126] pci_device_probe+0x26/0x50
   [c0205b27] really_probe+0x87/0x100
   [c0205aa0] really_probe+0x0/0x100
   [c0128186] kthread+0xb6/0xc0
   [c01280d0] kthread+0x0/0xc0
   [c0103963] kernel_thread_helper+0x7/0x14
   ===
  Code:  Bad EIP value.
  EIP: [d880a000] 0xd880a000 SS:ESP 0068:d7a15f7c

 ---
 ~Randy



Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Igmar Palsenberg

  Hmm.. Switching CONFIG_HZ from 1000 to 250 seems to 'fix' the problem. 
  I haven't seen the issue in nearly a week now. This makes Andrew's theory 
  about missing interrupts very likely.
  
  Andrew / others : Is there a way to find out if it *is* missing 
  interrupts ?
  
 
 umm, nasty.  What's in /proc/interrupts?

See below. The other machine is mostly identical, except for i8042 
missing (probably due to running an older kernel, or small differences in 
the kernel config).

Regards,


Igmar

[EMAIL PROTECTED] ~]$ cat /proc/interrupts
   CPU0   CPU1
  0:   73702693   74509271   IO-APIC-edge  timer
  1:  1  1   IO-APIC-edge  i8042
  4:   2289   8389   IO-APIC-edge  serial
  8:  0  1   IO-APIC-edge  rtc
  9:  0  0   IO-APIC-fasteoi   acpi
 12:  3  1   IO-APIC-edge  i8042
 16:  203127788  0   IO-APIC-fasteoi   uhci_hcd:usb2, eth0
 17:525492   IO-APIC-fasteoi   uhci_hcd:usb4
 18:   1370   67584889   IO-APIC-fasteoi   arcmsr
 19:  0  0   IO-APIC-fasteoi   ehci_hcd:usb1
 20:  0  0   IO-APIC-fasteoi   uhci_hcd:usb3
NMI:  0  0
LOC:  148127756  148133476
ERR:  0
MIS:  0


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Duncan Sands
 I'm really not convinced about the user-mode thing unless somebody can 
 show me a good reason for it. Not just some "wouldn't it be nice" kind of 
 thing. A real, honest-to-goodness reason that we actually _want_ to see 
 used.

Qemu?  It would be nice if emulators could directly drive hardware:
useful for reverse engineering windows drivers for example.

Duncan.


Re: [PATCH 2.6.20-rc1] ib_verbs: Use explicit if-else statements to avoid errors with do-while macros

2006-12-14 Thread Andrew Morton
On Thu, 14 Dec 2006 06:56:24 +0000
Al Viro [EMAIL PROTECTED] wrote:

 On Thu, Dec 14, 2006 at 06:44:30AM +0000, Al Viro wrote:
  On Wed, Dec 13, 2006 at 10:10:05PM -0500, Ben Collins wrote:
   At least on PPC, the op ? op : dma construct causes a compile failure
   because the dma_* is a do{}while(0) macro.
   
   This turns all of them into proper if/else to avoid this problem.
  
  NAK.
  
  Proper fix is to kill stupid do { } while (0) mess.  It's supposed
  to behave like a function returning void, so it should be ((void)0).
 
 BTW, even though the original patch is already merged, I think that
 we ought to get rid of do-while in such stubs, exactly to avoid such
 problems in the future.  Probably even add to CodingStyle - it's not
 the first time such crap happens.
 
 IOW, do ; while(0) / do { } while (0)  is not a proper way to do a macro
 that imitates a function returning void.
 
 Objections?

Would prefer static inline void foo(args){} when possible - for the arg
typechecking and arg existence checking and unused variable warnings.

I end up having to do rather a lot of things like this:

--- a/mm/vmalloc.c~virtual-memmap-on-sparsemem-v3-map-and-unmap-fix-2
+++ a/mm/vmalloc.c
@@ -929,6 +929,6 @@ int unmap_generic_kernel(unsigned long a
if (err)
break;
} while (pgd++, addr = next, addr != end);
-   flush_tlb_kernel_range((unsigned long)start_addr, end_addr);
+   flush_tlb_kernel_range(addr, addr);
return err;
 }


and this:


@@ -85,12 +84,24 @@ extern void vm_events_fold_cpu(int cpu);
 #else
 
 /* Disable counters */
-#define get_cpu_vm_events(e)   0L
-#define count_vm_event(e)      do { } while (0)
-#define count_vm_events(e,d)   do { } while (0)
-#define __count_vm_event(e)    do { } while (0)
-#define __count_vm_events(e,d) do { } while (0)
-#define vm_events_fold_cpu(x)  do { } while (0)
+static inline void count_vm_event(enum vm_event_item item)
+{
+}
+static inline void count_vm_events(enum vm_event_item item, long delta)
+{
+}
+static inline void __count_vm_event(enum vm_event_item item)
+{
+}
+static inline void __count_vm_events(enum vm_event_item item, long delta)
+{
+}
+static inline void all_vm_events(unsigned long *ret)
+{
+}
+static inline void vm_events_fold_cpu(int cpu)
+{
+}

because of these problems.

Plus macros are putrid.
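
The difference between the three stub styles discussed in this thread can be
sketched in plain userspace C. This is an illustration only, not kernel code:
the names dma_sync_real, dma_sync_stub, dma_sync_macro and sync_for_device are
made up for the example; only the stub styles themselves come from the
messages above.

```c
#include <assert.h>

/* Hypothetical "real" operation; the counter only makes calls observable. */
static int real_calls;

static inline void dma_sync_real(int handle)
{
	assert(handle >= 0);
	real_calls++;
}

/* Style 1: a typed, empty static inline function. */
static inline void dma_sync_stub(int handle)
{
	(void)handle;	/* the argument is still type-checked */
}

/* Style 2: an expression of type void. */
#define dma_sync_macro(handle)	((void)0)

/*
 * Style 3 is a statement, not an expression, so it cannot be an
 * operand of the conditional operator:
 *
 *	#define dma_sync_broken(handle)	do { } while (0)
 *	use_real ? dma_sync_real(h) : dma_sync_broken(h);   // syntax error
 */

void sync_for_device(int use_real, int handle)
{
	/* Both operands have type void, which ?: accepts. */
	use_real ? dma_sync_real(handle) : dma_sync_stub(handle);
	use_real ? dma_sync_real(handle) : dma_sync_macro(handle);
}
```

Calling sync_for_device(1, 7) runs the real operation twice, and
sync_for_device(0, 7) runs nothing. Of the two working styles, only the
inline function checks that its argument exists and has the right type.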


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Andrew Morton
On Thu, 14 Dec 2006 09:55:38 +0100 (CET)
Igmar Palsenberg [EMAIL PROTECTED] wrote:

 
   Hmm.. Switching CONFIG_HZ from 1000 to 250 seems to 'fix' the problem. 
   I haven't seen the issue in nearly a week now. This makes Andrew's theory 
   about missing interrupts very likely.
   
   Andrew / others : Is there a way to find out if it *is* missing 
   interrupts ?
   
  
  umm, nasty.  What's in /proc/interrupts?
 
 See below. The other machine is mostly identical, except for i8042 
 missing (probably due to running an older kernel, or small differences in 
 the kernel config).
 

Does the other machine have the same problems?

Are you able to rule out a hardware failure?

 [EMAIL PROTECTED] ~]$ cat /proc/interrupts
CPU0   CPU1
   0:   73702693   74509271   IO-APIC-edge  timer
   1:  1  1   IO-APIC-edge  i8042
   4:   2289   8389   IO-APIC-edge  serial
   8:  0  1   IO-APIC-edge  rtc
   9:  0  0   IO-APIC-fasteoi   acpi
  12:  3  1   IO-APIC-edge  i8042
  16:  203127788  0   IO-APIC-fasteoi   uhci_hcd:usb2, eth0
  17:525492   IO-APIC-fasteoi   uhci_hcd:usb4
  18:   1370   67584889   IO-APIC-fasteoi   arcmsr
  19:  0  0   IO-APIC-fasteoi   ehci_hcd:usb1
  20:  0  0   IO-APIC-fasteoi   uhci_hcd:usb3
 NMI:  0  0
 LOC:  148127756  148133476
 ERR:  0
 MIS:  0

The disk interrupt is unshared, which rules out a few software problems, I
guess.
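
Whether a line is shared can also be checked mechanically. The following is a
rough userspace sketch, assuming the /proc/interrupts layout shown above (IRQ
label, per-CPU counters, one chip-type column, then a comma-separated device
list); other kernel versions may add columns. The function name
irq_user_count is invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the devices listed on one /proc/interrupts line.
 * Returns -1 if the line has no "label:" prefix, 0 if no device is
 * listed (e.g. NMI, LOC), and otherwise the number of sharers.
 */
int irq_user_count(const char *line)
{
	const char *p;
	int users;

	assert(line != NULL);
	p = strchr(line, ':');
	if (p == NULL)
		return -1;
	p++;

	/* Skip the per-CPU counters. */
	for (;;) {
		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;
		while (isdigit((unsigned char)*p))
			p++;
	}
	if (*p == '\0' || *p == '\n')
		return 0;

	/* Skip the chip-type column, e.g. IO-APIC-fasteoi. */
	while (*p != '\0' && !isspace((unsigned char)*p))
		p++;
	while (*p == ' ' || *p == '\t')
		p++;
	if (*p == '\0' || *p == '\n')
		return 0;

	/* What remains is the device list; commas separate sharers. */
	users = 1;
	for (; *p != '\0'; p++)
		if (*p == ',')
			users++;
	return users;
}
```

Applied to the listing above, the IRQ 16 line (uhci_hcd:usb2, eth0) yields 2
and the arcmsr line on IRQ 18 yields 1.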



Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Thomas Gleixner
On Wed, 2006-12-13 at 23:56 +0000, Alan wrote:
 On Wed, 13 Dec 2006 23:30:55 +0100
 Thomas Gleixner [EMAIL PROTECTED] wrote:
 
  - IRQ happens
  - kernel handler runs and masks the chip irq, which removes the IRQ
  request
 
 IRQ is shared with the disk driver, box dead.

Err ? 

IRQ happens

IRQ is disabled by the generic handling code

Handler is invoked and checks, whether the irq is from the device or
not. 
 - If not, it returns IRQ_NONE, so the next driver (e.g. disk) is
invoked.
 - If yes, it masks the chip on the device, which disables the chip
interrupt line and returns IRQ_HANDLED.

In both cases the IRQ gets reenabled from the generic irq handling code
on return, so why is the box dead ?
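
The flow described above can be modelled in a few lines of userspace C. This
is a toy simulation, not the kernel's generic IRQ code: irqreturn_t, IRQ_NONE
and IRQ_HANDLED are redefined locally to mirror the kernel names, and
fake_dev, shared_irq, masking_handler, servicing_handler and
dispatch_shared_irq are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's return codes. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* Hypothetical device model: is it asserting the shared line? */
struct fake_dev {
	int asserting;
	int chip_irq_masked;
};

typedef irqreturn_t (*irq_handler_fn)(struct fake_dev *dev);

/* Handler of the kind described above: it only masks the chip. */
irqreturn_t masking_handler(struct fake_dev *dev)
{
	if (!dev->asserting)
		return IRQ_NONE;	/* not ours; the next handler runs */
	dev->chip_irq_masked = 1;	/* masking removes the IRQ request */
	dev->asserting = 0;
	return IRQ_HANDLED;
}

/* Conventional handler, e.g. a disk driver servicing its device. */
irqreturn_t servicing_handler(struct fake_dev *dev)
{
	if (!dev->asserting)
		return IRQ_NONE;
	dev->asserting = 0;
	return IRQ_HANDLED;
}

struct shared_irq {
	int enabled;
	size_t nr;
	irq_handler_fn handler[4];
	struct fake_dev *dev[4];
};

/* Model of the generic flow: disable, call every handler, re-enable. */
int dispatch_shared_irq(struct shared_irq *irq)
{
	int handled = 0;
	size_t i;

	assert(irq->nr <= 4);
	irq->enabled = 0;
	for (i = 0; i < irq->nr; i++)
		if (irq->handler[i](irq->dev[i]) == IRQ_HANDLED)
			handled++;
	irq->enabled = 1;	/* re-enabled whatever the handlers returned */
	return handled;
}
```

With a masking handler and a disk handler on the same line, an interrupt
raised by the disk is passed over by the first handler (IRQ_NONE) and serviced
by the second, and the line is enabled again afterwards.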

tglx




Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Wed, Dec 13, 2006 at 09:34:16PM +0100, Karsten Weiss wrote:

 FWIW: As far as I understand the linux kernel code (I am no kernel 
 developer so please correct me if I am wrong) the PCI dma mapping code is 
 abstracted by struct dma_mapping_ops. I.e. there are currently four 
 possible implementations for x86_64 (see
 linux-2.6/arch/x86_64/kernel/)
 
 1. pci-nommu.c : no IOMMU at all (e.g. because you have < 4 GB memory)
    Kernel boot message: "PCI-DMA: Disabling IOMMU."
 
 2. pci-gart.c : (AMD) Hardware-IOMMU.
    Kernel boot message: "PCI-DMA: using GART IOMMU" (this message
    first appeared in 2.6.16)
 
 3. pci-swiotlb.c : Software-IOMMU (used e.g. if there is no hw iommu)
    Kernel boot message: "PCI-DMA: Using software bounce buffering 
    for IO (SWIOTLB)"

Used if there's no HW IOMMU *and* it's needed (because you have >4GB
memory) or you told the kernel to use it (iommu=soft).

 4. pci-calgary.c : Calgary HW-IOMMU from IBM; used in pSeries servers. 
    This HW-IOMMU supports dma address mapping with memory protection,
    etc.
    Kernel boot message: "PCI-DMA: Using Calgary IOMMU" (since
    2.6.18!)

Calgary is found in pSeries servers, but also in high-end xSeries
(Intel based) servers. It would be a little awkward if pSeries servers
(which are based on PowerPC processors) used code under arch/x86-64
:-)
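
As a rough summary of the four cases, here is a userspace sketch of the
selection rule as it is described in this message. It is not the kernel's
actual detection code: the type and function names (dma_backend,
platform_info, pick_dma_backend) are invented, the precedence between Calgary
and GART is an assumption, and the sketch does not model whether the real
kernel skips a hardware IOMMU on machines with little memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dma_backend { DMA_NOMMU, DMA_GART, DMA_SWIOTLB, DMA_CALGARY };

/* Hypothetical description of the machine and the boot options. */
struct platform_info {
	int has_gart;		/* AMD GART hardware IOMMU present */
	int has_calgary;	/* IBM Calgary hardware IOMMU present */
	int iommu_soft;		/* booted with iommu=soft */
	uint64_t ram_bytes;
};

#define FOUR_GIB (UINT64_C(4) << 30)

enum dma_backend pick_dma_backend(const struct platform_info *p)
{
	assert(p != NULL);

	if (p->iommu_soft)		/* explicit request wins */
		return DMA_SWIOTLB;
	if (p->has_calgary)		/* order of the two hw checks is assumed */
		return DMA_CALGARY;
	if (p->has_gart)
		return DMA_GART;
	if (p->ram_bytes > FOUR_GIB)	/* no hw IOMMU, but one is needed */
		return DMA_SWIOTLB;
	return DMA_NOMMU;
}
```

For example, a machine with 8 GB of RAM and no hardware IOMMU ends up with
swiotlb, while the same machine with 2 GB ends up with no IOMMU at all.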

 BTW: It would be really great if this area of the kernel would get some 
 more and better documentation. The information at 
 linux-2.6/Documentation/x86_64/boot_options.txt is very terse. I had to 
 read the code to get a *rough* idea what all the iommu= options 
 actually do and how they interact.

Patches happily accepted :-)

Cheers,
Muli


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Igmar Palsenberg

  See below. The other machine is mostly identical, except for i8042 
  missing (probably due to running an older kernel, or small differences in 
  the kernel config).
  
 
 Does the other machine have the same problems?

No, but that machine has a lot less disk and network activity.
 
 Are you able to rule out a hardware failure?

100% ? No, but the hardware is relatively new (about a year old), and of 
good quality. It's hard to reproduce, so looking at it when it starts to 
fault isn't possible either :(

 The disk interrupt is unshared, which rules out a few software problems, I
 guess.

Indeed. Bah, I hate these kind of things :(



Regards,


Igmar


Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Wed, Dec 13, 2006 at 01:29:25PM -0700, Erik Andersen wrote:
 On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote:
  We could not reproduce the data corruption anymore if we boot
  the machines with the kernel parameter iommu=soft i.e. if we
  use software bounce buffering instead of the hw-iommu.
 
 I just realized that booting with iommu=soft makes my pcHDTV
 HD5500 DVB cards not work.  Time to go back to disabling the
 memhole and losing 1 GB.  :-(

That points to a bug in the driver (likely) or swiotlb (unlikely), as
the IOMMU in use should be transparent to the driver. Which driver is
it?

Cheers,
Muli


Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 12:33:23AM +0100, Christoph Anton Mitterer wrote:

 4)
 And does someone know if the nforce/opteron iommu requires IBM Calgary
 IOMMU support?

It doesn't, Calgary isn't found in machines with Opteron CPUs or NForce
chipsets (AFAIK). However, compiling Calgary in should make no
difference, as we detect at run-time which IOMMU is found in the
machine.

Cheers,
Muli


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Muli Ben-Yehuda
On Wed, Dec 13, 2006 at 10:15:47PM +0100, Arjan van de Ven wrote:

 with DRI you have the case where something needs to do security
 validation of the commands that are sent to the card. (to avoid a
 non-privileged user to DMA all over your memory)

We also have the interesting case where your card is behind an
isolation-capable IOMMU, so if you let userspace program it, you need
a userspace-accessible DMA-API for IOMMU mappings (or to pre-map
everything in the IOMMU, which loses on some of the benefits of
isolation-capable IOMMUs (i.e., only map what you need to use right
now)).

Cheers,
Muli


Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Chris Wedgwood
On Wed, Dec 13, 2006 at 09:11:29PM +0100, Christoph Anton Mitterer wrote:

 - error in the Opteron (memory controller)
 - error in the Nvidia chipsets
 - error in the kernel

My guess without further information would be that some, but not all
BIOSes are doing some work to avoid this.

Does anyone have an amd64 with an nforce4 chipset and 4GB that does
NOT have this problem?  If so it might be worth chasing the BIOS
vendors to see what errata they are dealing with.


[PATCH 2.6.19] m32r: Fix do_page_fault and update_mmu_cache

2006-12-14 Thread Hirokazu Takata
Fix do_page_fault and update_mmu_cache.

  * Fix do_page_fault (vmalloc_fault:) to pass error_code correctly
to update_mmu_cache by using a thread-fault code for all m32r chips.

  * Fix update_mmu_cache for OPSP chip
- The #ifdef CONFIG_CHIP_OPSP portion is a workaround for OPSP;
  add a notfound-case operation to update_mmu_cache for OPSP,
  as for other m32r chips.
- Fix pte_data that was not initialized if no entry found.
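
The pte_data part of the fix can be pictured with a small userspace model. It
is a sketch only: soft_tlb, tlb_entry and soft_tlb_update are invented names,
and the round-robin replacement in the not-found path is an assumption made
for the example, not a description of the m32r hardware. The point is that
the value is computed once before the search, so both paths see it
initialized.

```c
#include <assert.h>
#include <stddef.h>

struct tlb_entry {
	unsigned long tag;	/* virtual address | ASID */
	unsigned long data;	/* PTE value */
};

/* Hypothetical software model of a small TLB. */
struct soft_tlb {
	struct tlb_entry entry[8];
	size_t next_victim;	/* assumed round-robin replacement */
};

/* Returns the index of the entry that was written. */
size_t soft_tlb_update(struct soft_tlb *tlb, unsigned long vaddr,
		       unsigned long pte_val)
{
	const size_t n = sizeof(tlb->entry) / sizeof(tlb->entry[0]);
	unsigned long pte_data = pte_val;	/* set before any branch */
	size_t i;

	assert(tlb->next_victim < n);

	/* Found case: refresh the matching entry. */
	for (i = 0; i < n; i++) {
		if (tlb->entry[i].tag == vaddr) {
			tlb->entry[i].data = pte_data;
			return i;
		}
	}

	/* Not-found case: pte_data is valid here as well. */
	i = tlb->next_victim;
	tlb->next_victim = (i + 1) % n;
	tlb->entry[i].tag = vaddr;
	tlb->entry[i].data = pte_data;
	return i;
}
```

If pte_data were assigned only inside the matching branch, as in the removed
OPSP loops, the not-found path would write an uninitialized value.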

Signed-off-by: Kazuhiro Inaoka [EMAIL PROTECTED]
Signed-off-by: Hirokazu Takata [EMAIL PROTECTED]
---
 arch/m32r/mm/fault.c |   40 +++-
 1 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/arch/m32r/mm/fault.c b/arch/m32r/mm/fault.c
index 9b9feb0..fc7ccdf 100644
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -362,8 +362,10 @@ vmalloc_fault:
if (!pte_present(*pte_k))
goto no_context;
 
-   addr = (address & PAGE_MASK) | (error_code & ACE_INSTRUCTION);
+   addr = (address & PAGE_MASK);
+   set_thread_fault_code(error_code);
update_mmu_cache(NULL, addr, *pte_k);
+   set_thread_fault_code(0);
return;
}
 }
@@ -377,7 +379,7 @@ vmalloc_fault:
 void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr,
pte_t pte)
 {
-   unsigned long *entry1, *entry2;
+   volatile unsigned long *entry1, *entry2;
unsigned long pte_data, flags;
unsigned int *entry_dat;
int inst = get_thread_fault_code() & ACE_INSTRUCTION;
@@ -391,30 +393,26 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr,
 
vaddr = (vaddr & PAGE_MASK) | get_asid();
 
+   pte_data = pte_val(pte);
+
 #ifdef CONFIG_CHIP_OPSP
entry1 = (unsigned long *)ITLB_BASE;
-   for(i = 0 ; i < NR_TLB_ENTRIES; i++) {
-   if(*entry1++ == vaddr) {
-   pte_data = pte_val(pte);
-   set_tlb_data(entry1, pte_data);
-   break;
-   }
-   entry1++;
+   for (i = 0; i < NR_TLB_ENTRIES; i++) {
+   if (*entry1++ == vaddr) {
+   set_tlb_data(entry1, pte_data);
+   break;
+   }
+   entry1++;
}
entry2 = (unsigned long *)DTLB_BASE;
-   for(i = 0 ; i < NR_TLB_ENTRIES ; i++) {
-   if(*entry2++ == vaddr) {
-   pte_data = pte_val(pte);
-   set_tlb_data(entry2, pte_data);
-   break;
-   }
-   entry2++;
+   for (i = 0; i < NR_TLB_ENTRIES; i++) {
+   if (*entry2++ == vaddr) {
+   set_tlb_data(entry2, pte_data);
+   break;
+   }
+   entry2++;
}
-   local_irq_restore(flags);
-   return;
 #else
-   pte_data = pte_val(pte);
-
/*
 * Update TLB entries
 *  entry1: ITLB entry address
@@ -439,6 +437,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr,
"i" (MSVA_offset), "i" (MTOP_offset), "i" (MIDXI_offset)
: "r4", "memory"
);
+#endif
 
if ((!inst && entry2 >= DTLB_END) || (inst && entry1 >= ITLB_END))
goto notfound;
@@ -482,7 +481,6 @@ notfound:
set_tlb_data(entry1, pte_data);
 
goto found;
-#endif
 }
 
 /*==*
-- 
1.4.4.2

--
Hirokazu Takata [EMAIL PROTECTED]
Linux/M32R Project:  http://www.linux-m32r.org/


[PATCH 2.6.19] m32r: Fix kernel entry address of vmlinux

2006-12-14 Thread Hirokazu Takata
This patch fixes the kernel entry point address of vmlinux.

The m32r kernel entry address is 0x08002000 (physical).
But, so far, the ENTRY point written in vmlinux.lds.S did not point to
the correct kernel entry address.

(before fix)
$ objdump -x vmlinux
vmlinux: file format elf32-m32r-linux
vmlinux
architecture: m32r2, flags 0x0112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x88002090/* NG */
:
Sections:
Idx Name  Size  VMA   LMA   File off  Algn
  0 .empty_zero_page 1000  88001000  88001000  1000  2**12
  CONTENTS, ALLOC, LOAD, DATA
  1 .boot 008c  88002000  88002000  2000  2**2
  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text 001ab694  88002090  88002090  2090  2**4
  CONTENTS, ALLOC, LOAD, READONLY, CODE
:

(after fix)
$ objdump -x vmlinux
vmlinux: file format elf32-m32r-linux
vmlinux
architecture: m32r2, flags 0x0112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x08002000/* OK */
:

This fix also removes the following GDB error message (seen with gdb-6.4 or
later) at the first operation of kernel debugging:
"Previous frame identical to this frame (corrupt stack?)".
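
The relation between the two addresses in the objdump output can be checked
with simple arithmetic. This is a sketch for illustration: the offset
0x80000000 is inferred here from the .boot VMA (0x88002000) and the stated
physical entry address (0x08002000), and the names M32R_VIRT_OFFSET and
m32r_entry_from_vma are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed offset, inferred from the two addresses quoted above. */
#define M32R_VIRT_OFFSET UINT32_C(0x80000000)

/* Convert a link-time (virtual) address into the physical entry address. */
uint32_t m32r_entry_from_vma(uint32_t vma)
{
	assert(vma >= M32R_VIRT_OFFSET);
	return vma - M32R_VIRT_OFFSET;
}
```

With these values, m32r_entry_from_vma(0x88002000) gives 0x08002000, the start
address reported after the fix.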

Signed-off-by: Hirokazu Takata [EMAIL PROTECTED]
---
 arch/m32r/Makefile |2 +-
 arch/m32r/kernel/vmlinux.lds.S |5 -
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/m32r/Makefile b/arch/m32r/Makefile
index f219c47..cdf63b2 100644
--- a/arch/m32r/Makefile
+++ b/arch/m32r/Makefile
@@ -7,7 +7,7 @@
 
 LDFLAGS:=
 OBJCOPYFLAGS   := -O binary -R .note -R .comment -S
-LDFLAGS_vmlinux:= -e startup_32
+LDFLAGS_vmlinux:=
 
 CFLAGS += -pipe -fno-schedule-insns
 CFLAGS_KERNEL += -mmodel=medium
diff --git a/arch/m32r/kernel/vmlinux.lds.S b/arch/m32r/kernel/vmlinux.lds.S
index 358b9ce..c497a2f 100644
--- a/arch/m32r/kernel/vmlinux.lds.S
+++ b/arch/m32r/kernel/vmlinux.lds.S
@@ -6,12 +6,15 @@
 #include <asm/page.h>
 
 OUTPUT_ARCH(m32r)
-ENTRY(startup_32)
 #if defined(__LITTLE_ENDIAN__)
jiffies = jiffies_64;
 #else
jiffies = jiffies_64 + 4;
 #endif
+
+kernel_entry = boot - 0x80000000;
+ENTRY(kernel_entry)
+
 SECTIONS
 {
   . = CONFIG_MEMORY_START + __PAGE_OFFSET;
-- 
1.4.4.2

--
Hirokazu Takata [EMAIL PROTECTED]
Linux/M32R Project:  http://www.linux-m32r.org/


[PATCH 2.6.19] m32r: Cosmetic updates and trivial fixes

2006-12-14 Thread Hirokazu Takata
Cosmetic updates and trivial fixes of m32r arch-dependent files.
- Remove RCS ID strings and trailing white lines
- Other misc. cosmetic updates

Signed-off-by: Hirokazu Takata [EMAIL PROTECTED]
---
 arch/m32r/kernel/head.S  |2 --
 arch/m32r/lib/ashxdi3.S  |3 ---
 arch/m32r/lib/checksum.S |3 +--
 arch/m32r/lib/delay.c|2 --
 arch/m32r/lib/memcpy.S   |2 --
 arch/m32r/lib/memset.S   |2 --
 arch/m32r/lib/strlen.S   |2 --
 arch/m32r/mm/fault-nommu.c   |5 +
 arch/m32r/mm/mmu.S   |5 +
 include/asm-m32r/a.out.h |2 --
 include/asm-m32r/addrspace.h |1 -
 include/asm-m32r/bugs.h  |2 --
 include/asm-m32r/byteorder.h |2 --
 include/asm-m32r/cache.h |2 --
 include/asm-m32r/cacheflush.h|1 -
 include/asm-m32r/current.h   |3 ---
 include/asm-m32r/delay.h |2 --
 include/asm-m32r/dma.h   |2 --
 include/asm-m32r/errno.h |3 ---
 include/asm-m32r/ide.h   |6 +-
 include/asm-m32r/ioctls.h|5 -
 include/asm-m32r/ipcbuf.h|4 
 include/asm-m32r/kmap_types.h|4 
 include/asm-m32r/m32104ut/m32104ut_pld.h |   11 +--
 include/asm-m32r/m32700ut/m32700ut_lan.h |   13 +
 include/asm-m32r/m32700ut/m32700ut_lcd.h |   13 +
 include/asm-m32r/m32700ut/m32700ut_pld.h |   13 +
 include/asm-m32r/mappi2/mappi2_pld.h |   13 ++---
 include/asm-m32r/mappi3/mappi3_pld.h |   11 +--
 include/asm-m32r/mc146818rtc.h   |3 ---
 include/asm-m32r/mman.h  |2 --
 include/asm-m32r/mmu.h   |   10 ++
 include/asm-m32r/mmu_context.h   |9 ++---
 include/asm-m32r/module.h|3 ---
 include/asm-m32r/msgbuf.h|4 
 include/asm-m32r/namei.h |4 
 include/asm-m32r/opsput/opsput_lan.h |   13 +
 include/asm-m32r/opsput/opsput_lcd.h |   13 +
 include/asm-m32r/opsput/opsput_pld.h |   13 +
 include/asm-m32r/page.h  |5 -
 include/asm-m32r/param.h |4 
 include/asm-m32r/pci.h   |2 --
 include/asm-m32r/pgalloc.h   |3 ---
 include/asm-m32r/pgtable-2level.h|3 ---
 include/asm-m32r/posix_types.h   |4 
 include/asm-m32r/rtc.h   |4 
 include/asm-m32r/scatterlist.h   |2 --
 include/asm-m32r/sections.h  |1 -
 include/asm-m32r/segment.h   |4 
 include/asm-m32r/sembuf.h|4 
 include/asm-m32r/setup.h |4 
 include/asm-m32r/shmbuf.h|4 
 include/asm-m32r/shmparam.h  |2 --
 include/asm-m32r/sigcontext.h|3 ---
 include/asm-m32r/siginfo.h   |2 --
 include/asm-m32r/signal.h|4 
 include/asm-m32r/smp.h   |3 ---
 include/asm-m32r/sockios.h   |2 --
 include/asm-m32r/spinlock_types.h|2 +-
 include/asm-m32r/stat.h  |4 
 include/asm-m32r/string.h|2 --
 include/asm-m32r/syscall.h   |3 ---
 include/asm-m32r/system.h|2 +-
 include/asm-m32r/termbits.h  |4 +---
 include/asm-m32r/termios.h   |2 --
 include/asm-m32r/timex.h |3 ---
 include/asm-m32r/tlbflush.h  |1 -
 include/asm-m32r/types.h |6 +-
 include/asm-m32r/ucontext.h  |2 --
 include/asm-m32r/unaligned.h |8 +---
 include/asm-m32r/unistd.h|2 --
 include/asm-m32r/user.h  |6 --
 include/asm-m32r/vga.h   |4 +---
 include/asm-m32r/xor.h   |2 --
 74 files changed, 68 insertions(+), 258 deletions(-)

diff --git a/arch/m32r/kernel/head.S b/arch/m32r/kernel/head.S
index 0d3c8ee..dab7436 100644
--- a/arch/m32r/kernel/head.S
+++ b/arch/m32r/kernel/head.S
@@ -7,8 +7,6 @@
  *Hitoshi Yamamoto
  */
 
-/* $Id$ */
-
 #include <linux/init.h>
 __INIT
 __INITDATA
diff --git a/arch/m32r/lib/ashxdi3.S b/arch/m32r/lib/ashxdi3.S
index 107594b..7fc0c19 100644
--- a/arch/m32r/lib/ashxdi3.S
+++ b/arch/m32r/lib/ashxdi3.S
@@ -4,8 +4,6 @@
  * Copyright (C) 2001,2002  Hiroyuki Kondo, and Hirokazu Takata
  *
  */
-/* $Id$ */
-
 
 ;
 ;  input   (r0,r1)  src
@@ -293,4 +291,3 @@ __lshrdi3:
 #endif /* not CONFIG_ISA_DUAL_ISSUE */
 
.end
-
diff --git a/arch/m32r/lib/checksum.S 

Executability of the stack

2006-12-14 Thread Franck Pommereau
Dear Linux developers,

I recently discovered that the Linux kernel on 32 bits x86 processors
reports the stack as being non-executable while it is actually
executable (because located in the same memory segment).

# grep maps /proc/self/maps
bfce8000-bfcfe000 rw-p bfce8000 00:00 0  [stack]

I think this is a serious security concern, as one could believe oneself
protected against the execution of code injected on the stack (or
heap) while this is not the case.

Is there any reason for this situation? Is it possible to correct it?
Maybe it comes from sharing source code for 64 bits and 32 bits
architectures but if so, it should be possible (and highly desirable) to
treat 32 bits differently.

Best regards,
Franck Pommereau

---
Below is the output from the ver_linux script:
---
Linux pixie 2.6.17-10-386 #2 Tue Dec 5 22:26:18 UTC 2006 i686 GNU/Linux

Gnu C  4.1.2
Gnu make   3.81
binutils   2.17
util-linux 2.12r
mount  2.12r
module-init-tools  3.2.2
e2fsprogs  1.39
jfsutils   1.1.8
reiserfsprogs  3.6.19
reiser4progs   1.0.5
xfsprogs   2.8.10
pcmcia-cs  3.2.8
PPP2.4.4
Linux C Library libc.2.4
Dynamic linker (ldd)   2.4
Procps 3.2.7
Net-tools  1.60
Console-tools  0.2.3
Sh-utils   5.96
udev   093
Modules Loaded vmnet vmmon binfmt_misc rfcomm hidp l2cap ipv6
usbhid speedstep_centrino cpufreq_userspace cpufreq_stats freq_table
cpufreq_powersave cpufreq_ondemand cpufreq_conservative video tc1100_wmi
sbs sony_acpi pcc_acpi i2c_ec hotkey dock dev_acpi button battery
container ac asus_acpi deflate zlib_deflate twofish serpent aes blowfish
des sha256 sha1 crypto_null af_key af_packet dm_mod md_mod visor
usbserial parport_pc lp parport pcmcia sr_mod cdrom yenta_socket joydev
rsrc_nonstatic pcmcia_core nvidia snd_hda_intel snd_hda_codec tsdev
hci_usb sg snd_pcm_oss snd_mixer_oss bluetooth snd_pcm snd_timer evdev
tg3 serio_raw i2c_core intel_agp pcspkr psmouse snd soundcore
snd_page_alloc agpgart rtc shpchp pci_hotplug xt_tcpudp xt_state
iptable_filter ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink
ip_tables x_tables ext3 jbd ehci_hcd uhci_hcd usbcore ide_generic sd_mod
generic ata_piix libata scsi_mod thermal processor fan capability
commoncap vesafb fbcon tileblit font bitblit softcursor
---


Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Erik Andersen
On Thu Dec 14, 2006 at 11:23:11AM +0200, Muli Ben-Yehuda wrote:
  I just realized that booting with iommu=soft makes my pcHDTV
  HD5500 DVB cards not work.  Time to go back to disabling the
  memhole and losing 1 GB.  :-(
 
 That points to a bug in the driver (likely) or swiotlb (unlikely), as
 the IOMMU in use should be transparent to the driver. Which driver is
 it?

presumably one of cx88xx, cx88_blackbird, cx8800, cx88_dvb,
cx8802, cx88_alsa, lgdt330x, tuner, cx2341x, btcx_risc,
video_buf, video_buf_dvb, tveeprom, or dvb_pll.  It seems
to take an amazing number of drivers to make these devices
actually work...

 -Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--


Re: Executability of the stack

2006-12-14 Thread Arjan van de Ven
On Thu, 2006-12-14 at 10:26 +0100, Franck Pommereau wrote:
 Dear Linux developers,
 
 I recently discovered that the Linux kernel on 32 bits x86 processors
 reports the stack as being non-executable while it is actually
 executable (because located in the same memory segment).

this is not per se true, it depends on the capabilities of your 32 bit
x86 processor.


 # grep stack /proc/self/maps
 bfce8000-bfcfe000 rw-p bfce8000 00:00 0  [stack]

this shows that the *intent* is to have it non-executable. 
Not all x86 processors can enforce this. All modern ones do.

 Is there any reason for this situation? 

the alternative (showing effective permission) is equally confusing;
apps would see permissions they didn't set...

 Maybe it comes from sharing source code for 64 bits and 32 bits
 architectures but if so, it should be possible (and highly desirable) to
 treat 32 bits differently.

it's not a 32 bit thing, it's an "older processors don't, newer ones
do" thing.

Can you paste your /proc/cpuinfo file here ? Maybe you have a processor
with the capability but just haven't enabled it (either in the bios or
in the kernel config)

Greetings,
   Arjan van de Ven
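
The permission field discussed above can be read programmatically. What follows is a minimal sketch, assuming the /proc/<pid>/maps line layout quoted in this message (address range followed by a four-character permission field). The function names are hypothetical and not part of any kernel interface. It reports only the intended permission; whether the CPU enforces it depends on the processor, as explained above.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the permission field of one /proc/<pid>/maps line
 * contains 'x', 0 if it does not, -1 if the line cannot be parsed.
 * This is the *intended* permission, not a statement about what the
 * processor is able to enforce. */
int maps_line_is_executable(const char *line)
{
    unsigned long start, end;
    char perms[5];

    if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
        return -1;
    return strchr(perms, 'x') != NULL;
}

/* Returns 1 if the line describes the [stack] mapping. */
int maps_line_is_stack(const char *line)
{
    return strstr(line, "[stack]") != NULL;
}
```

Feeding it the line from the report (rw-p ... [stack]) yields "stack, not marked executable", which matches what the kernel intends even on hardware that cannot enforce it.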



Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 02:52:35AM -0700, Erik Andersen wrote:
 On Thu Dec 14, 2006 at 11:23:11AM +0200, Muli Ben-Yehuda wrote:
   I just realized that booting with iommu=soft makes my pcHDTV
   HD5500 DVB cards not work.  Time to go back to disabling the
   memhole and losing 1 GB.  :-(
  
  That points to a bug in the driver (likely) or swiotlb (unlikely), as
  the IOMMU in use should be transparent to the driver. Which driver is
  it?
 
 presumably one of cx88xx, cx88_blackbird, cx8800, cx88_dvb,
 cx8802, cx88_alsa, lgdt330x, tuner, cx2341x, btcx_risc,
 video_buf, video_buf_dvb, tveeprom, or dvb_pll.  It seems
 to take an amazing number of drivers to make these devices
 actually work...

Yikes! Do you know which one actually handles the DMA mappings? I
suspect a missing unmap or sync, which swiotlb requires to sync back
the bounce buffer with the driver's buffer.

Cheers,
Muli
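
To illustrate why a missing sync matters with a bounce buffer, here is a toy userspace model. It is only a sketch: none of these names exist in the kernel, and the real counterparts would be the DMA API unmap/sync calls (for example dma_sync_single_for_cpu()). The "device" writes only to the bounce buffer, so the driver sees the data only after the sync step copies it back.

```c
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 64

/* Toy model of a software IOTLB mapping: the device only ever touches
 * 'bounce'; the driver only ever touches 'driver_buf'. */
struct toy_mapping {
    unsigned char bounce[BOUNCE_SIZE];
    unsigned char *driver_buf;
    size_t len;
};

/* "Map" a driver buffer for device-to-memory DMA. */
int toy_map(struct toy_mapping *m, unsigned char *buf, size_t len)
{
    if (len > BOUNCE_SIZE)
        return -1;
    m->driver_buf = buf;
    m->len = len;
    memset(m->bounce, 0, sizeof(m->bounce));
    return 0;
}

/* The "device" deposits its data in the bounce buffer. */
void toy_device_write(struct toy_mapping *m, const unsigned char *data,
                      size_t len)
{
    if (len > m->len)
        len = m->len;
    memcpy(m->bounce, data, len);
}

/* Sync for the CPU: copy the bounce buffer back into the driver's
 * buffer. A driver that skips this step keeps reading stale data. */
void toy_sync_for_cpu(struct toy_mapping *m)
{
    memcpy(m->driver_buf, m->bounce, m->len);
}
```

With a hardware IOMMU the device writes straight into the driver's memory, so a driver that forgets the sync can appear to work there and fail only under iommu=soft.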


2.6.19.1-rt14-smp circular locking dependency

2006-12-14 Thread Mike Galbraith
Greetings,

Lockdep doesn't approve of cpufreq, and seemingly with cause... I had to
poke SysRq-O.

[ 1103.164377] Disabling non-boot CPUs ...
[ 1103.171094] stopped custom tracer.
[ 1103.174614] 
[ 1103.174618] ===
[ 1103.182692] [ INFO: possible circular locking dependency detected ]
[ 1103.189178] 2.6.19.1-rt14-smp #3
[ 1103.192564] ---
[ 1103.199062] s2ram/6643 is trying to acquire lock:
[ 1103.203976]  (cpu_bitmask_lock){--..}, at: [c104a085] lock_cpu_hotplug+0x22/0x6d
[ 1103.211988] 
[ 1103.211991] but task is already holding lock:
[ 1103.218112]  (workqueue_mutex){--..}, at: [c1038a2a] workqueue_cpu_callback+0x1c6/0x299
[ 1103.226702] 
[ 1103.226706] which lock already depends on the new lock.
[ 1103.226708] 
[ 1103.235197] 
[ 1103.235203] the existing dependency chain (in reverse order) is:
[ 1103.242992] 
[ 1103.242994] - #3 (workqueue_mutex){--..}:
[ 1103.248905][c1042c21] add_lock_to_list+0x39/0x91
[ 1103.254859][c10454fc] __lock_acquire+0xc65/0xd3a
[ 1103.260760][c104562e] lock_acquire+0x5d/0x79
[ 1103.266271][c13ec1b2] _mutex_lock+0x2b/0x38
[ 1103.271801][c1038516] __create_workqueue+0x5f/0x16c
[ 1103.278013][c131cf1f] cpufreq_governor_dbs+0x274/0x321
[ 1103.284429][c131ae69] __cpufreq_governor+0x22/0x15e
[ 1103.290652][c131b426] __cpufreq_set_policy+0xe6/0x135
[ 1103.296994][c131b990] store_scaling_governor+0xa8/0x1e8
[ 1103.303577][c131c335] store+0x37/0x4a
[ 1103.308517][c10c14a9] sysfs_write_file+0x87/0xc1
[ 1103.314442][c10801e8] vfs_write+0xa6/0x170
[ 1103.319795][c108089c] sys_write+0x3d/0x64
[ 1103.325060][c1003293] syscall_call+0x7/0xb
[ 1103.330450][b7cc9e0e] 0xb7cc9e0e
[ 1103.334948][] 0x
[ 1103.339453] 
[ 1103.339456] - #2 (dbs_mutex){--..}:
[ 1103.347070][c1042c21] add_lock_to_list+0x39/0x91
[ 1103.347089][c10454fc] __lock_acquire+0xc65/0xd3a
[ 1103.347098][c104562e] lock_acquire+0x5d/0x79
[ 1103.347105][c13ec1b2] _mutex_lock+0x2b/0x38
[ 1103.347115][c131cdba] cpufreq_governor_dbs+0x10f/0x321
[ 1103.347124][c131ae69] __cpufreq_governor+0x22/0x15e
[ 1103.347134][c131b426] __cpufreq_set_policy+0xe6/0x135
[ 1103.347142][c131b990] store_scaling_governor+0xa8/0x1e8
[ 1103.347151][c131c335] store+0x37/0x4a
[ 1103.347158][c10c14a9] sysfs_write_file+0x87/0xc1
[ 1103.347167][c10801e8] vfs_write+0xa6/0x170
[ 1103.347176][c108089c] sys_write+0x3d/0x64
[ 1103.347184][c1003293] syscall_call+0x7/0xb
[ 1103.347192][b7cc9e0e] 0xb7cc9e0e
[ 1103.347212][] 0x
[ 1103.347221] 
[ 1103.347222] - #1 (policy-lock){--..}:
[ 1103.347227][c1042c21] add_lock_to_list+0x39/0x91
[ 1103.347235][c10454fc] __lock_acquire+0xc65/0xd3a
[ 1103.347242][c104562e] lock_acquire+0x5d/0x79
[ 1103.347250][c13ec1b2] _mutex_lock+0x2b/0x38
[ 1103.347258][c131b854] cpufreq_set_policy+0x35/0x79
[ 1103.347266][c131c0f5] cpufreq_add_dev+0x2b4/0x451
[ 1103.347274][c126734f] sysdev_driver_register+0x59/0x96
[ 1103.347284][c131c582] cpufreq_register_driver+0x66/0xfc
[ 1103.347292][c1630df9] cpufreq_p4_init+0x3a/0x51
[ 1103.347301][c10004b1] init+0x128/0x3da
[ 1103.347308][c1003f1b] kernel_thread_helper+0x7/0x1c
[ 1103.347316][] 0x
[ 1103.347371] 
[ 1103.347372] - #0 (cpu_bitmask_lock){--..}:
[ 1103.347380][c1043846] print_circular_bug_tail+0x39/0x73
[ 1103.347389][c1045375] __lock_acquire+0xade/0xd3a
[ 1103.347397][c104562e] lock_acquire+0x5d/0x79
[ 1103.347404][c13ec1b2] _mutex_lock+0x2b/0x38
[ 1103.347412][c104a085] lock_cpu_hotplug+0x22/0x6d
[ 1103.347420][c131bc31] cpufreq_driver_target+0x27/0x5d
[ 1103.347429][c131c2d9] cpufreq_cpu_callback+0x47/0x6c
[ 1103.347437][c1034fd6] notifier_call_chain+0x2c/0x39
[ 1103.347446][c1034fff] raw_notifier_call_chain+0x8/0xa
[ 1103.347454][c1049dc5] _cpu_down+0x4c/0x25c
[ 1103.347463][c104a1b5] disable_nonboot_cpus+0x92/0x16d
[ 1103.347471][c104fc39] enter_state+0x72/0x1a6
[ 1103.347480][c104fe10] state_store+0xa3/0xac
[ 1103.347488][c10c1170] subsys_attr_store+0x20/0x25
[ 1103.347496][c10c14a9] sysfs_write_file+0x87/0xc1
[ 1103.347503][c10801e8] vfs_write+0xa6/0x170
[ 1103.347511][c108089c] sys_write+0x3d/0x64
[ 1103.347519][c1003293] syscall_call+0x7/0xb
[ 1103.347526][b7e7be0e] 0xb7e7be0e
[ 1103.347535][] 0x
[ 1103.347544] 
[ 1103.347545] other info that might help us debug this:
[ 1103.347546] 
[ 1103.347549] 2 locks held by s2ram/6643:
[ 1103.347551]  #0:  (cpu_add_remove_lock){--..}, at: [c104a136] 
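
The report above describes a lock-order cycle: the existing chain runs cpu_bitmask_lock, policy-lock, dbs_mutex, workqueue_mutex, and the suspend path now takes cpu_bitmask_lock while holding workqueue_mutex. The following is a small sketch of that kind of check, not lockdep's actual implementation. The graph structure and function names are made up for illustration; only the lock names come from the trace.

```c
#include <string.h>

#define MAX_LOCKS 8

/* Lock classes named in the trace above. */
enum { CPU_BITMASK_LOCK, POLICY_LOCK, DBS_MUTEX, WORKQUEUE_MUTEX };

/* dep[a][b] != 0 means "lock b has been taken while holding lock a". */
struct lock_graph {
    int dep[MAX_LOCKS][MAX_LOCKS];
};

void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static int reachable(const struct lock_graph *g, int from, int to, int *seen)
{
    int i;

    if (from == to)
        return 1;
    seen[from] = 1;
    for (i = 0; i < MAX_LOCKS; i++)
        if (g->dep[from][i] && !seen[i] && reachable(g, i, to, seen))
            return 1;
    return 0;
}

/* Record "acquired 'next' while holding 'held'".
 * Returns -1 without recording if that would close a cycle, else 0. */
int graph_add_dependency(struct lock_graph *g, int held, int next)
{
    int seen[MAX_LOCKS] = {0};

    if (reachable(g, next, held, seen))
        return -1;
    g->dep[held][next] = 1;
    return 0;
}
```

Adding the three existing dependencies succeeds; adding "cpu_bitmask_lock while holding workqueue_mutex" is rejected because workqueue_mutex is already reachable from cpu_bitmask_lock.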

[PATCH 2.6.19.1] m32r: Build fix for processors without ISA_DSP_LEVEL2

2006-12-14 Thread Hirokazu Takata
Additional fixes for processors without ISA_DSP_LEVEL2.
sigcontext_t does not have dummy_acc1h, dummy_acc1l members any longer.

This patch is against v2.6.19.1 kernel.

From: Hirokazu Takata [EMAIL PROTECTED]
Subject: [PATCH 2.6.19] m32r: Make userspace headers platform-independent
Date: Wed, 06 Dec 2006 19:00:01 +0900
 The m32r kernel 2.6.18-rc1 and later cause "unknown isa configuration"
 build errors in userspace application programs such as glibc and gdb.
 
 This is because recent kernels no longer include linux/config.h, so as
 not to expose kernel headers to userspace.
 
 To fix the above compile errors, this patch fixes two headers ptrace.h
 and sigcontext.h for m32r and makes them platform-independent.
 
 Signed-off-by: Hirokazu Takata [EMAIL PROTECTED]
 ---
  arch/m32r/kernel/entry.S  |   65 ++--
  include/asm-m32r/ptrace.h |   28 ++---
  include/asm-m32r/sigcontext.h |   13 +---
  3 files changed, 35 insertions(+), 71 deletions(-)
 

Signed-off-by: Hirokazu Takata [EMAIL PROTECTED]
---
 arch/m32r/kernel/process.c |2 +-
 arch/m32r/kernel/signal.c  |   26 --
 2 files changed, 5 insertions(+), 23 deletions(-)

diff --git a/arch/m32r/kernel/process.c b/arch/m32r/kernel/process.c
index 44cbe0c..a689e29 100644
--- a/arch/m32r/kernel/process.c
+++ b/arch/m32r/kernel/process.c
@@ -174,7 +174,7 @@ void show_regs(struct pt_regs * regs)
 	  regs->acc1h, regs->acc1l);
 #elif defined(CONFIG_ISA_M32R2) || defined(CONFIG_ISA_M32R)
 	printk("ACCH[%08lx]:ACCL[%08lx]\n", \
-	  regs->acch, regs->accl);
+	  regs->acc0h, regs->acc0l);
 #else
 #error unknown isa configuration
 #endif
diff --git a/arch/m32r/kernel/signal.c b/arch/m32r/kernel/signal.c
index b60cea4..045f958 100644
--- a/arch/m32r/kernel/signal.c
+++ b/arch/m32r/kernel/signal.c
@@ -109,19 +109,10 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 	COPY(r10);
 	COPY(r11);
 	COPY(r12);
-#if defined(CONFIG_ISA_M32R2) && defined(CONFIG_ISA_DSP_LEVEL2)
 	COPY(acc0h);
 	COPY(acc0l);
-	COPY(acc1h);
-	COPY(acc1l);
-#elif defined(CONFIG_ISA_M32R2) || defined(CONFIG_ISA_M32R)
-	COPY(acch);
-	COPY(accl);
-	COPY(dummy_acc1h);
-	COPY(dummy_acc1l);
-#else
-#error unknown isa configuration
-#endif
+	COPY(acc1h);	/* ISA_DSP_LEVEL2 only */
+	COPY(acc1l);	/* ISA_DSP_LEVEL2 only */
 	COPY(psw);
 	COPY(bpc);
 	COPY(bbpsw);
@@ -196,19 +187,10 @@ setup_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs,
 	COPY(r10);
 	COPY(r11);
 	COPY(r12);
-#if defined(CONFIG_ISA_M32R2) && defined(CONFIG_ISA_DSP_LEVEL2)
 	COPY(acc0h);
 	COPY(acc0l);
-	COPY(acc1h);
-	COPY(acc1l);
-#elif defined(CONFIG_ISA_M32R2) || defined(CONFIG_ISA_M32R)
-	COPY(acch);
-	COPY(accl);
-	COPY(dummy_acc1h);
-	COPY(dummy_acc1l);
-#else
-#error unknown isa configuration
-#endif
+	COPY(acc1h);	/* ISA_DSP_LEVEL2 only */
+	COPY(acc1l);	/* ISA_DSP_LEVEL2 only */
 	COPY(psw);
 	COPY(bpc);
 	COPY(bbpsw);
-- 
1.4.4.2

--
Hirokazu Takata [EMAIL PROTECTED]
Linux/M32R Project:  http://www.linux-m32r.org/


Re: 2.6.19.1-rt14-smp circular locking dependency

2006-12-14 Thread Ingo Molnar

* Mike Galbraith [EMAIL PROTECTED] wrote:

 Greetings,
 
 Lockdep doesn't approve of cpufreq, and seemingly with cause... I had 
 to poke SysRq-O.

hm ... this must be an upstream problem too, right? -rt shouldn't change 
anything in this area (in theory).

Ingo


Re: Userspace I/O driver core

2006-12-14 Thread Avi Kivity

Greg KH wrote:
 A large number of people have expressed interest recently in the
 userspace i/o driver core which allows userspace drivers to be written
 to handle some types of hardware.
 
 Right now the UIO core is working and in the -mm releases.  It's been
 rewritten from the last time patches were posted to lkml and is much
 simpler.  It also includes full documentation and two example drivers
 and two example userspace programs that test those drivers.
 
 But in order to get this core into the kernel tree, we need to have some
 real drivers written that use it.  So, for anyone that wants to see
 this go into the tree, now is the time to step forward and post your
 patches for hardware that this kind of driver interface is needed.

[...]

 If anyone has any questions on how to use this interface, or anything
 else about it, please let me and Thomas know.

I understand one still has to write a kernel driver to shut up the irq.
How about writing a small bytecode interpreter to make even that
unnecessary?


The userspace driver would register a couple of bytecode programs: 
is_interrupt_pending() and disable_interrupt(), which the uio framework 
would call when the interrupt fires.


The bytecode could reuse net/core/filter.c, with the packet replaced by 
the mmio or ioregion, or use something new.
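
As a rough illustration of the proposal, here is a toy interpreter run against a register window. Everything in it is hypothetical: the opcode set, the structure names and the register layout are invented for this sketch and are not part of uio or net/core/filter.c. There are no jumps, so every program terminates after at most 'len' instructions, and every register index is bounds-checked.

```c
#include <stddef.h>
#include <stdint.h>

enum uio_bc_op { BC_LOAD, BC_LDI, BC_AND, BC_STORE, BC_RET };

struct uio_bc_insn {
    enum uio_bc_op op;
    uint32_t arg;   /* register index for LOAD/STORE, immediate otherwise */
};

/* Run 'prog' against a window of 'nregs' 32-bit registers.
 * Returns the accumulator at BC_RET, or -1 on a bad register index,
 * an unknown opcode, or a program that ends without BC_RET. */
int64_t uio_bc_run(const struct uio_bc_insn *prog, size_t len,
                   volatile uint32_t *regs, size_t nregs)
{
    uint32_t acc = 0;
    size_t pc;

    for (pc = 0; pc < len; pc++) {
        const struct uio_bc_insn *in = &prog[pc];

        switch (in->op) {
        case BC_LOAD:
            if (in->arg >= nregs)
                return -1;
            acc = regs[in->arg];
            break;
        case BC_LDI:
            acc = in->arg;
            break;
        case BC_AND:
            acc &= in->arg;
            break;
        case BC_STORE:
            if (in->arg >= nregs)
                return -1;
            regs[in->arg] = acc;
            break;
        case BC_RET:
            return acc;
        default:
            return -1;
        }
    }
    return -1;
}
```

An is_interrupt_pending() program would then be "load status register, mask the pending bit, return", and disable_interrupt() would be "load immediate 0, store to the enable register, return".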


--
error compiling committee.c: too many arguments to function



Re: realtime-preempt and arm

2006-12-14 Thread Ingo Molnar

* tike64 [EMAIL PROTECTED] wrote:

 Steven Rostedt [EMAIL PROTECTED] wrote:
  Also, have you tried this with a nanosleep instead of a select?
  Select's timeout is just that, a timeout. It's not supposed to be
  accurate, as long as it doesn't expire early.  The reason I state
  this, is that select uses a different mechanism than nanosleep, and
  that can indeed affect the jitter.
 
 Ok, understood; I tried this:
 
   t = raw_timer();
   ts.tv_nsec = 500;
   ts.tv_sec = 0;
   nanosleep(&ts, 0);
   t = raw_timer() - t;
 
 It is better but I still see 8ms occasional delays when listing 
 nfs-mounted directories onto FB. And, what is funny, also this version 
 makes the average delay 20ms as if it made the jiffy 20ms.

ARM has no high resolution timers support yet in the -rt tree.

Ingo
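
For anyone wanting to reproduce this kind of measurement, a self-contained variant of the snippet above might look like the sketch below. raw_timer() from the original is not shown in the thread, so clock_gettime(CLOCK_MONOTONIC) is substituted here as an assumption; the function name measure_sleep_ns is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Sleep for 'req_ns' nanoseconds (must be below one second) and return
 * the elapsed time in nanoseconds as seen by CLOCK_MONOTONIC, or -1 on
 * error. */
int64_t measure_sleep_ns(long req_ns)
{
    struct timespec ts, rem, t0, t1;

    ts.tv_sec = 0;
    ts.tv_nsec = req_ns;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1;
    /* Restart with the remaining time if a signal interrupts the sleep. */
    while (nanosleep(&ts, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        ts = rem;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1;

    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (t1.tv_nsec - t0.tv_nsec);
}
```

The difference between the returned value and the requested interval is the wakeup latency. Without high resolution timers it is rounded up to the tick, so it cannot be smaller than the jiffy granularity.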


Re: 2.6.19.1-rt14-smp circular locking dependency

2006-12-14 Thread Mike Galbraith
On Thu, 2006-12-14 at 10:59 +0100, Ingo Molnar wrote:
 * Mike Galbraith [EMAIL PROTECTED] wrote:
 
  Greetings,
  
  Lockdep doesn't approve of cpufreq, and seemingly with cause... I had 
  to poke SysRq-O.
 
 hm ... this must be an upstream problem too, right? -rt shouldn't change 
 anything in this area (in theory).

I'll find out in a few... enabling lockdep / compiling 2.6.19.1.

-Mike



Re: [stable] [PATCH 46/61] fix Intel RNG detection

2006-12-14 Thread Jan Beulich
 dean gaudet [EMAIL PROTECTED] 14.12.06 09:40 
On Thu, 14 Dec 2006, Jan Beulich wrote:

 with the patch it boots perfectly without any command-line args.
 
 Are you getting the 'Firmware space is locked read-only' message then?

yep...

so let me ask a naive question... don't we want the firmware locked 
read-only because that protects the bios from viruses?  honestly i'm naive 
in this area of pc hardware, but i'm kind of confused why we'd want 
unlocked firmware just so we can detect a RNG.

Indeed, these are contradicting requirements. The RNG detection, as
outlined by Intel documentation, requires being able to write to firmware
hub space (which in turn is hidden behind BIOS space). But I agree that
this is not a good solution (and even without that, it is not good to
require temporarily making invisible the entire BIOS code/data in order
to detect a non-essential device like an RNG).

Jan


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Hans-Jürgen Koch
On Thursday, 14 December 2006 10:30, Muli Ben-Yehuda wrote:
 On Wed, Dec 13, 2006 at 10:15:47PM +0100, Arjan van de Ven wrote:
 
  with DRI you have the case where something needs to do security
  validation of the commands that are sent to the card. (to avoid a
  non-privileged user to DMA all over your memory)
 
 We also have the interesting case where your card is behind an
 isolation-capable IOMMU, so if you let userspace program it, you need
 a userspace-accessible DMA-API for IOMMU mappings (or to pre-map
 everything in the IOMMU, which loses on some of the benefits of
 isolation-capable IOMMUs (i.e., only map what you need to use right
 now)).

Userspace IO (UIO) was never intended to replace all kinds of possible
drivers. We wanted to address the situation where a manufacturer of
industrial I/O cards wants to do a large part of his driver in userspace
to simplify his development process. That's all.
Most of these I/O cards have registers or dual ported RAM that can be
mapped to userspace. This is possible with a standard kernel and is done
every day. Problem is that you can't handle interrupts. UIO simply adds
this capability and offers a standardized interface.
The code Greg added to his tree can do this for most hardware found
on industrial IO boards. That's all we wanted to achieve for now. If 
somebody wants to support more sophisticated things, suggestions are
welcome.

Hans
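
The userspace half of such a driver can be very small. The sketch below assumes the convention used by the UIO code that later went into mainline, where a blocking read() on the device node returns a 32-bit interrupt count; the interface of the version discussed in this thread may differ, and the function name is hypothetical.

```c
#include <stdint.h>
#include <unistd.h>

/* Block until the descriptor delivers one 32-bit event count.
 * Returns 0 and stores the count on success, -1 on error, end of
 * file, or a short read. */
int wait_for_irq(int fd, uint32_t *count)
{
    ssize_t n = read(fd, count, sizeof(*count));

    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A real program would open the device node, mmap() the card's registers, and then loop on wait_for_irq(), handling the card from userspace after each wakeup.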



Re: Userspace I/O driver core

2006-12-14 Thread Arjan van de Ven

 I understand one still has to write a kernel driver to shut up the irq.  
 How about writing a small bytecode interpreter to make even that 
 unnecessary?

if you do that why not do a real driver.




WARNING at fs/inotify.c:181

2006-12-14 Thread Rolf Eike Beer
Version is 2.6.19-rc6-git. System was more or less idle, just normal desktop 
stuff (copying single files by scp, writing mail). Don't know what exactly 
was working when this happened, I saw it some minutes later.

BUG: warning at fs/inotify.c:181/set_dentry_child_flags()
 [c017da03] set_dentry_child_flags+0xcf/0x11c
 [c017daa3] remove_watch_no_event+0x53/0x5f
 [c017db91] inotify_remove_watch_locked+0x12/0x3e
 [c02896d7] mutex_lock+0x1a/0x29
 [c017de59] inotify_rm_wd+0x6d/0x8a
 [c017e322] sys_inotify_rm_watch+0x38/0x4f
 [c0102dcb] syscall_call+0x7/0xb
 [c028007b] xfrm_policy_flush+0x10e/0x180
 ===

Greetings,

Eike




Re: Userspace I/O driver core

2006-12-14 Thread Hans-Jürgen Koch
On Thursday, 14 December 2006 10:44, Avi Kivity wrote:
 
 I understand one still has to write a kernel driver to shut up the irq.  
 How about writing a small bytecode interpreter to make even that 
 unnecessary?
 
 The userspace driver would register a couple of bytecode programs: 
 is_interrupt_pending() and disable_interrupt(), which the uio framework 
 would call when the interrupt fires.
 
 The bytecode could reuse net/core/filter.c, with the packet replaced by 
 the mmio or ioregion, or use something new.
 

I think this would be overkill. The kernel module you have to write
is _really_ very simple. And it has to be written only once, so even
a manufacturer who employs no experienced kernel developers can
easily outsource that task.

Hans



Re: [RFC 1/1] Char: isicom, remove tty_{hang,wake}up bottomhalves

2006-12-14 Thread Alan
On Thu, 14 Dec 2006 01:35:17 +0100 (CET)
Jiri Slaby [EMAIL PROTECTED] wrote:

 isicom, remove tty_{hang,wake}up bottomhalves
 
 - tty_hangup() itself schedules work, so there is no need to schedule hangup
   in the driver
 - tty_wakeup() is safe to call while in atomic context (IS THIS CORRECT?),
   so its schedule_work can also be removed
 
 Signed-off-by: Jiri Slaby [EMAIL PROTECTED]
 

Acked-by: Alan Cox [EMAIL PROTECTED]


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Alan
 2008?  I bet a lot of people would read the above to say that their
 system will just drop dead of a New Year's hangover, and they'll freak.
 I wouldn't want to be the one getting all the email at that point...

I wouldn't worry. Everyone will have patched it back out again by then,
or made their driver lie about the license. Both of which make the
problem worse not better when it comes to debugging.

Alan


Re: realtime-preempt and arm

2006-12-14 Thread tike64
Ingo Molnar [EMAIL PROTECTED] wrote:
 tike64 [EMAIL PROTECTED] wrote:
  Ok, understood; I tried this:
  
  t = raw_timer();
  ts.tv_nsec = 500;
  ts.tv_sec = 0;
  nanosleep(&ts, 0);
  t = raw_timer() - t;
  
  It is better but I still see 8ms occasional delays when listing 
  nfs-mounted directories onto FB. And, what is funny, also this
  version makes the average delay 20ms as if it made the jiffy 20ms.
 
 ARM has no high resolution timers support yet in the -rt tree.

Yes, but is there a reason why the -rt patch seems to make the 10ms
jiffy 20ms, and why the jitter is so high? I don't need high resolution,
just reasonable jitter of a couple of milliseconds.

--

tike



 



Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Hans-Jürgen Koch
On Thursday, 14 December 2006 09:49, Duncan Sands wrote:
  I'm really not convinced about the user-mode thing unless somebody can 
  show me a good reason for it. Not just some "wouldn't it be nice" kind of 
  thing. A real, honest-to-goodness reason that we actually _want_ to see 
  used.
 
 Qemu?  It would be nice if emulators could directly drive hardware:
 useful for reverse engineering windows drivers for example.

I really think there is a big misunderstanding in this whole discussion.
Userspace IO (UIO) was never intended to be a generic userspace 
interface to all kinds of hardware. I completely agree with Linus and all
others who expressed that opinion that a full-fledged kernel module is the,
let's say, most beautiful way of writing a driver. But it's not always the
best way. Let's look at a real world example:

A small German manufacturer produces high-end AD converter cards. He sells
100 pieces per year, only in Germany and only with Windows drivers. He would
now like to make his cards work with Linux. He has two driver programmers
with little experience in writing Linux kernel drivers. What do you tell him?
Write a large kernel module from scratch? Completely rewrite his code 
because it uses floating point arithmetic?

And even if they did that, do we really want it? Do we want to add large
kernel modules for each exotic card? With UIO, everything becomes much cleaner:

* They let somebody write the small kernel module they need to handle 
interrupts in a _clean_ way. This module can easily be checked and could
even be included in a mainline kernel.

* They do the rest in userspace, with all the libraries and tools they're
used to. That's what they _can_ do.

Note that this is a _technical_ reason. I'm not talking about all the
licensing possibilities that were discussed here.

UIO's intention is to allow manufacturers of industrial IO hardware to
support Linux without the need to hire half a dozen experienced kernel
developers. It makes their kernels more stable and easier to maintain.
We don't get flooded with requests to include large modules for exotic
hardware into the mainline kernel. 

The alternative is what we have at the moment: Manufacturers don't support
Linux at all, because it's too difficult to handle for them, or they do
it in a way that either violates our licence or leads to unstable 
customized kernels (or both).

So, your qemu suggestion is certainly interesting. But, really, we never
thought of such a general thing while we were working at that code.
I thought I had to make that clear. Reading this thread, one could get
the impression we wanted to start a revolution and handle all hardware
in userspace from now on. This is definitely wrong.

Hans



Re: Userspace I/O driver core

2006-12-14 Thread Alan
 But in order to get this core into the kernel tree, we need to have some
 real drivers written that use it.  So, for anyone that wants to see
 this go into the tree, now is the time to step forward and post your
 patches for hardware that this kind of driver interface is needed.

Might be kind of hairy given uio_read() doesn't even return from the
kernel. This code simply isn't fit for purpose, philosophical debate
aside.

Alan


Re: Userspace I/O driver core

2006-12-14 Thread Avi Kivity

Arjan van de Ven wrote:
  I understand one still has to write a kernel driver to shut up the irq.
  How about writing a small bytecode interpreter to make even that
  unnecessary?
 
 if you do that why not do a real driver.

An entire driver in bytecode? That means exposing the entire kernel API
to the bytecode interpreter.  A monumental task.

Or did I misunderstand you?


--
error compiling committee.c: too many arguments to function



Re: [PATCH 1/3] EDAC: Fix in e752x mc driver

2006-12-14 Thread Alan
On Wed, 13 Dec 2006 17:17:45 -0800 (PST)
Doug Thompson [EMAIL PROTECTED] wrote:

 From: Mike Chan [EMAIL PROTECTED]
 
 Diff against 2.6.19
 
 This fix/change returns the offset into the page for
 the ce/ue error, instead of just 0. The e752x dram controller reads
 34:6 of the
 linear address with the error.
 
 Mike Chan
 
 Signed-off-by: Mike Chan [EMAIL PROTECTED]
 Signed-off-by: doug thompson [EMAIL PROTECTED]

Acked-by: Alan Cox [EMAIL PROTECTED]
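
The arithmetic behind "offset into the page" can be sketched as follows. This is not the driver's code: it assumes, purely for illustration, that the error register holds address bits 34:6 right-aligned and that pages are 4 KiB; all names are hypothetical.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                     /* assumes 4 KiB pages */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

struct err_location {
    uint64_t page;             /* page frame number */
    uint32_t offset_in_page;   /* byte offset, 64-byte granularity */
};

/* 'addr_34_6' is assumed to hold address bits 34:6, right-aligned. */
struct err_location locate_error(uint32_t addr_34_6)
{
    uint64_t addr = (uint64_t)addr_34_6 << 6;  /* bits 5:0 are not reported */
    struct err_location loc;

    loc.page = addr >> TOY_PAGE_SHIFT;
    loc.offset_in_page = (uint32_t)(addr & (TOY_PAGE_SIZE - 1));
    return loc;
}
```

Because the low six address bits are not reported, the offset can only be known to the nearest 64 bytes, which is still more useful than always reporting 0.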


Re: [PATCH 3/3] EDAC: Add Fully-Buffered DIMM APIs to core

2006-12-14 Thread Alan

 +void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 + unsigned int csrow,
 + unsigned int channela,
 + unsigned int channelb,
 + char *msg)
 +{
 + int len = EDAC_MC_LABEL_LEN * 4;
 + char labels[len + 1];

Have you checked that gcc generates the right code from this and not a
dynamic allocation? I'm not sure if you want const int len to force that.


Otherwise looks ok
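
One way to sidestep the question is to size the array with an integer constant expression, so that it is not a variable-length array as far as the language is concerned. This is a sketch only: the value of EDAC_MC_LABEL_LEN below is an assumption for illustration, and the enumerator and function names are hypothetical.

```c
#include <stddef.h>

/* Assumed value for illustration; the real definition lives in the
 * EDAC core headers. */
#define EDAC_MC_LABEL_LEN 31

/* An enumerator is an integer constant expression in C, so the array
 * below has a fixed size known at compile time. By contrast, 'int len'
 * (and, in C, even 'const int len') makes 'char labels[len + 1]' a
 * variable-length array by the language rules, whatever the optimizer
 * later does with it. */
enum { FBD_LABELS_LEN = EDAC_MC_LABEL_LEN * 4 };

size_t fbd_labels_buffer_size(void)
{
    char labels[FBD_LABELS_LEN + 1];

    return sizeof(labels);   /* constant for a fixed-size array */
}
```

A plain #define for the length would work equally well; the point is only that the bound is a constant expression rather than a variable.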


Re: Userspace I/O driver core

2006-12-14 Thread Avi Kivity

[why trim the cc?]

Hans-Jürgen Koch wrote:
 On Thursday, 14 December 2006 10:44, Avi Kivity wrote:
  I understand one still has to write a kernel driver to shut up the irq.
  How about writing a small bytecode interpreter to make even that
  unnecessary?
  
  The userspace driver would register a couple of bytecode programs:
  is_interrupt_pending() and disable_interrupt(), which the uio framework
  would call when the interrupt fires.
  
  The bytecode could reuse net/core/filter.c, with the packet replaced by
  the mmio or ioregion, or use something new.
 
 I think this would be overkill. The kernel module you have to write
 is _really_ very simple. And it has to be written only once, so even
 a manufacturer who employs no experienced kernel developers can
 easily outsource that task.

It has to be written once, but compiled for every kernel version and
$arch out there (for out of tree drivers), or it has to wait for the
next kernel release and distro sync (for in-tree drivers).

If we make userspace drivers possible, it makes sense that the entire
driver be in userspace, not just 98.7% of it.




--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/3] EDAC: Add memory scrubbing controls API to core

2006-12-14 Thread Alan
On Wed, 13 Dec 2006 17:18:53 -0800 (PST)
Doug Thompson [EMAIL PROTECTED] wrote:

 From: Frithiof Jensen [EMAIL PROTECTED]
 
  This patch is meant for Kernel version 2.6.19
  
  This is an attempt to provide an interface for memory 
  scrubbing control in EDAC.

 Signed-off-by: Frithiof Jensen [EMAIL PROTECTED]
 Signed-off-by: doug thompson [EMAIL PROTECTED]

Acked-by: Alan Cox [EMAIL PROTECTED]



Re: Userspace I/O driver core

2006-12-14 Thread Arjan van de Ven
On Thu, 2006-12-14 at 12:46 +0200, Avi Kivity wrote:
 Arjan van de Ven wrote:
  I understand one still has to write a kernel driver to shut up the irq.  
  How about writing a small bytecode interpreter to make even that 
  unnecessary?
  
 
  if you do that why not do a real driver.
 

 
 An entire driver in bytecode? 

no a real, non-bytecode driver.




Re: Userspace I/O driver core

2006-12-14 Thread Avi Kivity

Arjan van de Ven wrote:

On Thu, 2006-12-14 at 12:46 +0200, Avi Kivity wrote:
  

Arjan van de Ven wrote:

I understand one still has to write a kernel driver to shut up the irq.  
How about writing a small bytecode interpreter to make even that
unnecessary?



if you do that why not do a real driver.

  
  
An entire driver in bytecode? 



no a real, non-bytecode driver.

  


Isn't the whole point of uio to avoid writing a kernel mode driver?

As proposed, it doesn't quite accomplish it.  With an additional 
bytecode interpreter, you can have a 100% userspace driver (the bytecode 
interpreter would be part of uio, not the driver).



--
error compiling committee.c: too many arguments to function



Need to enable caches in SMP ? (was Kernel 2.6 SMP very slow with ServerWorks LE Chipset)

2006-12-14 Thread Alan
 As per Alan's suggestion I decompressed the kernel source tree with the 
 processes pegged to one CPU then the other, and as he predicted it took 
 vastly longer on one CPU than the other, but I don't know what that 
 implies, or how to fix it.

From the timing it sounds like one processor cache is disabled which is a
little peculiar to say the least.

Alan


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Alan
On Wed, 13 Dec 2006 22:01:15 -0800
Hua Zhong [EMAIL PROTECTED] wrote:

  I think allowing binary hardware drivers in userspace hurts 
  our ability to leverage companies to release hardware specs. 
 
 If filesystems can be in user space, why can't drivers be in user space? On 
 what *technical* ground?

The FUSE file system interface provides a clean disciplined interface
which allows an fs to live in user space. The uio layer (if it's ever
fixed and cleaned up) provides some basic hooks that allow a user space
program to arbitrarily control hardware and make a nasty undebuggable mess.

uio also doesn't handle hotplug, pci and other small matters.

Now if you wanted to make uio useful at minimum you would need

-  PCI support
-  The ability to mark sets of I/O addresses for the card as
unmappable, read only, read-write, any read/root write, root
read/write
-  A proper IRQ handler
-  A DMA interface
-  The ability to describe sharing rules

Which actually is a description of the core of the DRM layer.
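
As an illustration of the second item in the list above (marking sets of
I/O addresses with an access policy), here is a minimal sketch in plain C.
Nothing here is part of uio or DRM: the enum, the struct and the function
names are all hypothetical and exist only to show what such a table could
look like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access policies, one per item named in the list above. */
enum region_policy {
    REGION_UNMAPPABLE,          /* nobody may touch it */
    REGION_READ_ONLY,           /* anyone may read, nobody may write */
    REGION_READ_WRITE,          /* anyone may read and write */
    REGION_ANY_READ_ROOT_WRITE, /* anyone may read, only root may write */
    REGION_ROOT_READ_WRITE      /* only root may read or write */
};

/* One contiguous range of I/O addresses on the card (hypothetical). */
struct io_region {
    uint64_t start;
    uint64_t len;
    enum region_policy policy;
};

/* Decide whether one access is permitted by a region's policy. */
bool region_allows(const struct io_region *r, bool is_root, bool want_write)
{
    switch (r->policy) {
    case REGION_UNMAPPABLE:          return false;
    case REGION_READ_ONLY:           return !want_write;
    case REGION_READ_WRITE:          return true;
    case REGION_ANY_READ_ROOT_WRITE: return want_write ? is_root : true;
    case REGION_ROOT_READ_WRITE:     return is_root;
    }
    return false;
}

/* Find the region containing addr, or NULL if the address is not listed. */
const struct io_region *find_region(const struct io_region *tab, size_t n,
                                    uint64_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= tab[i].start && addr - tab[i].start < tab[i].len)
            return &tab[i];
    return NULL;
}

/* Addresses outside every listed region are refused by default. */
bool access_allowed(const struct io_region *tab, size_t n, uint64_t addr,
                    bool is_root, bool want_write)
{
    const struct io_region *r = find_region(tab, n, addr);
    return r != NULL && region_allows(r, is_root, want_write);
}
```

The default-deny choice for unlisted addresses is an assumption of this
sketch, not something stated in the thread.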


Re: Userspace I/O driver core

2006-12-14 Thread Thomas Gleixner
On Thu, 2006-12-14 at 10:52 +0000, Alan wrote:
 Might be kind of hairy given uio_read() doesn't even return from the
 kernel. 

We probably talk about different code here, right ? The one, I'm looking
at returns on each interrupt event.

tglx




Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Alan
On Thu, 14 Dec 2006 08:21:20 +
David Woodhouse [EMAIL PROTECTED] wrote:

 If they fail to do that under the 'honour system' then I'm not averse to
 'enforcing' it by technical measures. (For some value of 'enforcement'
 which is easy for them to patch out if their lawyers are _really_ sure
 they'll win when I sue them, that is.)

There are specific rules against removal of technical measures *even if
the result is legal*. It is an offence in many countries thanks to the
RIAA lobbyists and their corrupt pet politicians to remove technical
measures applied to a -public domain- work.

So your argument doesn't fly.

 Not on my part. The thing that makes me _particularly_ vehement about
 binary-only crap this week is very much a technical issue -- in
 particular, the fact that we had to do post-production board
 modifications to shoot our wireless chip in the head when it goes AWOL,
 because the code for it wasn't available to us.

Consider it an education process. Hopefully the contracts for the
chips/docs were watertight enough you can sue the offending supplier for
the total cost of the rework. If not then you are really complaining
about getting contract negotiations wrong.

 It's better to have a coherent approach, and for all of us to do it on
 roughly the same timescale. Getting the distributions to do this is
 going to be like herding cats -- having it upstream and letting it
 trickle down is a much better approach, I think.

I doubt any distribution but the FSF purified Debian (the one that has
no firmware so doesn't work) would do it.

Alan


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Jan Engelhardt

 For the sharing case, some sort of softirq should be created. That is, when a
 hard interrupt is generated and the irq handler is executed, set a flag that
 at some other point in time, the irq is delivered to userspace. Like you do with
 signals in userspace:

NO.

The whole point is, YOU CANNOT DO THIS.

You need to shut the device up. Otherwise it keeps screaming.

Please, people, don't confuse the issue any further. A hardware driver

   ABSOLUTELY POSITIVELY HAS TO

have an in-kernel irq handler that knows how to turn the irq off.

End of story. No ifs, buts, maybes about it.

I don't get you. The rtc module does something similar (RTC generates
interrupts and notifies userspace about it)


  irqreturn_t uio_handler(...) {
          disable interrupts for this dev;
          /* set a flag that notifies userspace at the next opportunity */
          somestruct->flag |= IRQ_ARRIVED;
          return IRQ_HANDLED;
  }


  /* Userspace->kernel notification, e.g. by means of a device node
     /dev/uio or some ioctl. */
  int uio_write(...) {
          somestruct->flag &= ~IRQ_ARRIVED;
          enable interrupts for the device;
  }



 - have an in-kernel irq handler that at a minimum knows how to test 
   whether the irq came from that device and knows how to shut it up.

This means NOT A GENERIC DRIVER. That simply isn't an option on the 
table, no matter how much people would like it to be.


-`J'
-- 


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Alan
  IRQ is shared with the disk driver, box dead.
 
 Err ? 
 
 IRQ happens
 
 IRQ is disabled by the generic handling code
 
 Handler is invoked and checks, whether the irq is from the device or
 not. 
  - If not, it returns IRQ_NONE, so the next driver (e.g. disk) is
 invoked.
  - If yes, it masks the chip on the device, which disables the chip
 interrupt line and returns IRQ_HANDLED.
 
 In both cases the IRQ gets reenabled from the generic irq handling code
 on return, so why is the box dead ?

I wrote this before your generic layer was in fact explained further to
not be generic at all and involve a new driver for each device. Your
original explanation was not clear.
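
For reference, the shared-line sequence quoted above (check whether the
interrupt is ours, return IRQ_NONE if not, otherwise mask the chip and
return IRQ_HANDLED) can be sketched as a small userspace simulation. This
is not kernel code and not the posted patch: the sim_* names, the fake
device struct and the return values are assumptions made up for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED (simulation only). */
enum sim_irqreturn { SIM_IRQ_NONE = 0, SIM_IRQ_HANDLED = 1 };

/* A fake device sharing an interrupt line with others. */
struct sim_device {
    uint32_t irq_status;     /* nonzero: the device wants service */
    bool irq_enabled;        /* chip-level interrupt enable */
    unsigned pending_events; /* events waiting for userspace */
};

/* Per-device handler: claim the interrupt only if it is really ours. */
enum sim_irqreturn sim_handler(struct sim_device *dev)
{
    if (!dev->irq_enabled || dev->irq_status == 0)
        return SIM_IRQ_NONE;   /* let the next handler on the line run */

    dev->irq_enabled = false;  /* mask at the chip so it stops asserting */
    dev->pending_events++;     /* userspace picks this up later */
    return SIM_IRQ_HANDLED;
}

/* Generic code: call every handler registered on the shared line. */
int sim_dispatch(struct sim_device **devs, int n)
{
    int handled = 0;
    for (int i = 0; i < n; i++)
        if (sim_handler(devs[i]) == SIM_IRQ_HANDLED)
            handled++;
    return handled;
}

/* Userspace has serviced the device: clear the cause and unmask. */
void sim_ack(struct sim_device *dev)
{
    dev->irq_status = 0;
    dev->irq_enabled = true;
}
```

The point the sketch makes is the one under dispute in this thread: the
test "is it ours" and the masking step both need device-specific
knowledge, here reduced to two struct fields.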

Alan


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Jan Engelhardt

  irqreturn_t uio_handler(...) {
          disable interrupts for this dev;
          /* set a flag that notifies userspace at the next opportunity */
          somestruct->flag |= IRQ_ARRIVED;
          return IRQ_HANDLED;
  }

Rather than IRQ_HANDLED, it should have been: remove this irq handler 
from the irq handlers for irq number N, so that it does not get called 
again until userspace has acked it.

See, maybe I don't have enough clue yet to exactly figure out why you 
say it does not work. However, this is how simple people see it 8)


-`J'
-- 


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Hans-Jürgen Koch
Am Donnerstag, 14. Dezember 2006 12:14 schrieb Alan:
 On Wed, 13 Dec 2006 22:01:15 -0800
 Hua Zhong [EMAIL PROTECTED] wrote:
 
   I think allowing binary hardware drivers in userspace hurts 
   our ability to leverage companies to release hardware specs. 
  
  If filesystems can be in user space, why can't drivers be in user space? On 
  what *technical* ground?
 
 The FUSE file system interface provides a clean disciplined interface
 which allows an fs to live in user space. The uio layer (if it's ever
 fixed and cleaned up) provides some basic hooks that allow a user space
 program to arbitrarily control hardware and make a nasty undebuggable mess.

You think it's easier for a manufacturer of industrial IO cards to
debug a (large) kernel module?

 
 uio also doesn't handle hotplug, pci and other small matters.

uio is supposed to be a very thin layer. Hotplug and PCI are already
handled by other subsystems. 

 
 Now if you wanted to make uio useful at minimum you would need
 

The majority of industrial IO cards have registers and/or dual port RAM
that can be mapped to user space (even today). We want to add a simple
way to handle interrupts for such cards. That's all.
The fact that there might be some sort of hardware/interrupts/situations
where this is not possible or not so simple isn't that important at the
moment. We can extend the UIO system if somebody actually requires these
extensions.

Hans



Re: Userspace I/O driver core

2006-12-14 Thread Alan
On Thu, 14 Dec 2006 12:22:16 +0100
Thomas Gleixner [EMAIL PROTECTED] wrote:

 On Thu, 2006-12-14 at 10:52 +0000, Alan wrote:
  Might be kind of hairy given uio_read() doesn't even return from the
  kernel. 
 
 We probably talk about different code here, right ? The one, I'm looking
 at returns on each interrupt event.

The patch Greg posted up has no return inside the while loop.

Alan


[PATCH] Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Karsten Weiss
On Thu, 14 Dec 2006, Muli Ben-Yehuda wrote:

 On Wed, Dec 13, 2006 at 09:34:16PM +0100, Karsten Weiss wrote:
 
  BTW: It would be really great if this area of the kernel would get some 
  more and better documentation. The information at 
  linux-2.6/Documentation/x86_64/boot_options.txt is very terse. I had to 
  read the code to get a *rough* idea what all the iommu= options 
  actually do and how they interact.
 
 Patches happily accepted :-)

Well, you asked for it. :-) So here's my little contribution. Please 
*double* *check*!

(BTW: I would like to know what DAC and SAC means in this context)

===

From: Karsten Weiss [EMAIL PROTECTED]

Patch summary:

- Better explanation of some of the iommu kernel parameter options.
- 32MB<<order instead of 32MB^order.
- Mention the default order.
- SWIOTLB config help text
- removed the duplication of the iommu kernel parameter documentation.
- mention Documentation/x86_64/boot-options.txt in 
  Documentation/kernel-parameters.txt
- list the four existing PCI DMA mapping implementations of arch x86_64

Signed-off-by: Karsten Weiss [EMAIL PROTECTED]

---

--- linux-2.6.19/arch/x86_64/kernel/pci-dma.c.original  2006-12-14 
11:15:38.348598021 +0100
+++ linux-2.6.19/arch/x86_64/kernel/pci-dma.c   2006-12-14 12:14:48.176967312 
+0100
@@ -223,30 +223,10 @@
 }
 EXPORT_SYMBOL(dma_set_mask);
 
-/* 
iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,merge]
- [,forcesac][,fullflush][,nomerge][,biomerge]
-   size  set size of iommu (in bytes)
-   noagp don't initialize the AGP driver and use full aperture.
-   off   don't use the IOMMU
-   leak  turn on simple iommu leak tracing (only when CONFIG_IOMMU_LEAK is on)
-   memaper[=order] allocate an own aperture over RAM with size 32MB^order.
-   noforce don't force IOMMU usage. Default.
-   force  Force IOMMU.
-   merge  Do lazy merging. This may improve performance on some block devices.
-  Implies force (experimental)
-   biomerge Do merging at the BIO layer. This is more efficient than merge,
-but should be only done with very big IOMMUs. Implies merge,force.
-   nomerge Don't do SG merging.
-   forcesac For SAC mode for masks <40bits  (experimental)
-   fullflush Flush IOMMU on each allocation (default)
-   nofullflush Don't use IOMMU fullflush
-   allowed  overwrite iommu off workarounds for specific chipsets.
-   soft     Use software bounce buffering (default for Intel machines)
-   noaperture Don't touch the aperture for AGP.
-   allowdac Allow DMA >4GB
-   nodac    Forbid DMA >4GB
-   panic    Force panic when IOMMU overflows
-*/
+/*
+ * See Documentation/x86_64/boot-options.txt for the iommu kernel parameter
+ * documentation.
+ */
 __init int iommu_setup(char *p)
 {
iommu_merge = 1;
--- linux-2.6.19/arch/x86_64/Kconfig.original   2006-12-14 11:37:35.832142506 
+0100
+++ linux-2.6.19/arch/x86_64/Kconfig2006-12-14 11:47:24.346056710 +0100
@@ -431,8 +431,8 @@
  on systems with more than 3GB. This is usually needed for USB,
  sound, many IDE/SATA chipsets and some other devices.
  Provides a driver for the AMD Athlon64/Opteron/Turion/Sempron GART
- based IOMMU and a software bounce buffer based IOMMU used on Intel
- systems and as fallback.
+ based hardware IOMMU and a software bounce buffer based IOMMU used
+ on Intel systems and as fallback.
  The code is only active when needed (enough memory and limited
  device) unless CONFIG_IOMMU_DEBUG or iommu=force is specified
  too.
@@ -458,6 +458,11 @@
 # need this always selected by IOMMU for the VIA workaround
 config SWIOTLB
bool
+   help
+ Support for a software bounce buffer based IOMMU used on Intel
+ systems which don't have a hardware IOMMU. Using this code
+ PCI devices with 32bit memory access only are able to be
+ used on systems with more than 3 GB.
 
 config X86_MCE
bool Machine check support if EMBEDDED
--- linux-2.6.19/Documentation/x86_64/boot-options.txt.original 2006-12-14 
11:11:32.099300994 +0100
+++ linux-2.6.19/Documentation/x86_64/boot-options.txt  2006-12-14 
12:10:24.028009890 +0100
@@ -180,35 +180,66 @@
   pci=lastbus=NUMBER  Scan upto NUMBER busses, no matter what the 
mptable says.
   pci=noacpi   Don't use ACPI to set up PCI interrupt routing.
 
-IOMMU
+IOMMU (input/output memory management unit)
+
+ Currently four x86_64 PCI DMA mapping implementations exist:
+
+   1. arch/x86_64/kernel/pci-nommu.c: use no hardware/software IOMMU at all
+  (e.g. because you have < 3 GB memory).
+  Kernel boot message: PCI-DMA: Disabling IOMMU
+
+   2. arch/x86_64/kernel/pci-gart.c: AMD GART based hardware IOMMU.
+  Kernel boot message: PCI-DMA: using GART IOMMU
+
+   3. arch/x86_64/kernel/pci-swiotlb.c : Software IOMMU implementation. Used
+  e.g. if there is no hardware IOMMU in the system and it is need because
+  you have 3GB memory or 

Re: Userspace I/O driver core

2006-12-14 Thread Hans-Jürgen Koch
Am Donnerstag, 14. Dezember 2006 12:39 schrieb Alan:
 On Thu, 14 Dec 2006 12:22:16 +0100
 Thomas Gleixner [EMAIL PROTECTED] wrote:
 
  On Thu, 2006-12-14 at 10:52 +0000, Alan wrote:
   Might be kind of hairy given uio_read() doesn't even return from the
   kernel. 
  
  We probably talk about different code here, right ? The one, I'm looking
  at returns on each interrupt event.
 
 The patch Greg posted up has no return inside the while loop.
 

There are three breaks in that while loop, the first makes it return as 
soon as an interrupt occurs.
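
To make the disagreement concrete: a loop with no return statement can
still leave the kernel if each exit path is a break. The following is a
userspace simulation of that shape, not the code Greg posted. The struct,
the sim_* names, the choice of error codes and the order of the checks
are assumptions for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Simulated device state (hypothetical, not the real uio structures). */
struct sim_uio {
    unsigned event_count; /* incremented once per interrupt */
    bool signal_pending;  /* stand-in for a pending signal */
};

/* Stand-in for sleeping: in this simulation every sleep ends with
 * exactly one interrupt arriving. */
static void sim_sleep(struct sim_uio *dev)
{
    dev->event_count++;
}

/* Returns 0 when a new interrupt was seen, or a negative errno value. */
int sim_uio_read(struct sim_uio *dev, unsigned *last_seen, bool nonblock)
{
    int ret;

    for (;;) {
        if (dev->event_count != *last_seen) {
            *last_seen = dev->event_count;
            ret = 0;        /* first break: an interrupt occurred */
            break;
        }
        if (nonblock) {
            ret = -EAGAIN;  /* second break: caller refuses to wait */
            break;
        }
        if (dev->signal_pending) {
            ret = -EINTR;   /* third break: interrupted by a signal */
            break;
        }
        sim_sleep(dev);
    }
    return ret;             /* the only return, outside the loop */
}
```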

Hans


Re: 2.6.19.1-rt14-smp circular locking dependency

2006-12-14 Thread Mike Galbraith
On Thu, 2006-12-14 at 10:59 +0100, Ingo Molnar wrote: 
 * Mike Galbraith [EMAIL PROTECTED] wrote:
 
  Greetings,
  
  Lockdep doesn't approve of cpufreq, and seemingly with cause... I had 
  to poke SysRq-O.
 
 hm ... this must be an upstream problem too, right? -rt shouldnt change 
 anything in this area (in theory).

Yeah, it is.  It didn't seize up, but lockdep griped.  Trace from
2.6.19.1 below, cc added.

[  129.309689] Disabling non-boot CPUs ...
[  129.335627] 
[  129.335631] ===
[  129.343584] [ INFO: possible circular locking dependency detected ]
[  129.350028] 2.6.19.1-smp #77
[  129.352973] ---
[  129.359379] s2ram/6178 is trying to acquire lock:
[  129.364178]  (cpu_bitmask_lock){--..}, at: [c13e23dd] mutex_lock+0x8/0xa
[  129.371298] 
[  129.371300] but task is already holding lock:
[  129.377274]  (workqueue_mutex){--..}, at: [c13e23dd] mutex_lock+0x8/0xa
[  129.384277] 
[  129.384279] which lock already depends on the new lock.
[  129.384281] 
[  129.392647] 
[  129.392649] the existing dependency chain (in reverse order) is:
[  129.400294] 
[  129.400296] -> #3 (workqueue_mutex){--..}:
[  129.406083][c103dd54] add_lock_to_list+0x3b/0x87
[  129.411895][c1040420] __lock_acquire+0xb75/0xc1a
[  129.417697][c10407f1] lock_acquire+0x5d/0x79
[  129.423135][c13e21ad] __mutex_lock_slowpath+0x6e/0x296
[  129.429470][c13e23dd] mutex_lock+0x8/0xa
[  129.434562][c1035815] __create_workqueue+0x5f/0x16c
[  129.440615][c1312a83] cpufreq_governor_dbs+0x2d6/0x32c
[  129.446943][c131073e] __cpufreq_governor+0x22/0x166
[  129.453009][c13112d9] __cpufreq_set_policy+0xe6/0x132
[  129.459267][c131153a] store_scaling_governor+0xa8/0x1e8
[  129.465676][c1310dbc] store+0x37/0x4a
[  129.470517][c10b743c] sysfs_write_file+0x8a/0xcb
[  129.476301][c1077bb8] vfs_write+0xa6/0x170
[  129.481584][c107826c] sys_write+0x3d/0x64
[  129.486761][c1003173] syscall_call+0x7/0xb
[  129.492018][b7bece0e] 0xb7bece0e
[  129.496389][] 0x
[  129.500789] 
[  129.500791] -> #2 (dbs_mutex){--..}:
[  129.508253][c103dd54] add_lock_to_list+0x3b/0x87
[  129.516360][c1040420] __lock_acquire+0xb75/0xc1a
[  129.524405][c10407f1] lock_acquire+0x5d/0x79
[  129.532057][c13e21ad] __mutex_lock_slowpath+0x6e/0x296
[  129.540608][c13e23dd] mutex_lock+0x8/0xa
[  129.547856][c13128bc] cpufreq_governor_dbs+0x10f/0x32c
[  129.556348][c131073e] __cpufreq_governor+0x22/0x166
[  129.564548][c13112d9] __cpufreq_set_policy+0xe6/0x132
[  129.572865][c131153a] store_scaling_governor+0xa8/0x1e8
[  129.581379][c1310dbc] store+0x37/0x4a
[  129.588249][c10b743c] sysfs_write_file+0x8a/0xcb
[  129.596053][c1077bb8] vfs_write+0xa6/0x170
[  129.603290][c107826c] sys_write+0x3d/0x64
[  129.610398][c1003173] syscall_call+0x7/0xb
[  129.617624][b7bece0e] 0xb7bece0e
[  129.623954][] 0x
[  129.630230] 
[  129.630232] -> #1 (&policy->lock){--..}:
[  129.639563][c103dd54] add_lock_to_list+0x3b/0x87
[  129.647225][c1040420] __lock_acquire+0xb75/0xc1a
[  129.654928][c10407f1] lock_acquire+0x5d/0x79
[  129.662217][c13e21ad] __mutex_lock_slowpath+0x6e/0x296
[  129.670439][c13e23dd] mutex_lock+0x8/0xa
[  129.677387][c131144e] cpufreq_set_policy+0x35/0x79
[  129.685230][c1311a79] cpufreq_add_dev+0x2b8/0x461
[  129.692970][c1264128] sysdev_driver_register+0x63/0xaa
[  129.701152][c1311d58] cpufreq_register_driver+0x68/0xfd
[  129.709430][c1610cf9] cpufreq_p4_init+0x3a/0x51
[  129.717006][c100049b] init+0x112/0x311
[  129.723784][c1003dff] kernel_thread_helper+0x7/0x18
[  129.731709][] 0x
[  129.738040] 
[  129.738042] -> #0 (cpu_bitmask_lock){--..}:
[  129.747694][c103f875] print_circular_bug_tail+0x30/0x66
[  129.756036][c1040231] __lock_acquire+0x986/0xc1a
[  129.763786][c10407f1] lock_acquire+0x5d/0x79
[  129.771202][c13e21ad] __mutex_lock_slowpath+0x6e/0x296
[  129.779450][c13e23dd] mutex_lock+0x8/0xa
[  129.786496][c1044326] lock_cpu_hotplug+0x22/0x82
[  129.794243][c131110b] cpufreq_driver_target+0x27/0x5d
[  129.802449][c1311c69] cpufreq_cpu_callback+0x47/0x6c
[  129.810548][c1032316] notifier_call_chain+0x2c/0x39
[  129.818555][c103233f] raw_notifier_call_chain+0x8/0xa
[  129.826752][c10440a9] _cpu_down+0x4c/0x219
[  129.833942][c1044483] disable_nonboot_cpus+0x92/0x14b
[  129.842105][c1049e2a] enter_state+0x7e/0x1bc
[  129.849530][c104a00b] state_store+0xa3/0xac
[  129.856813][c10b7110] subsys_attr_store+0x20/0x25
[  129.864627][c10b743c] 

Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Xavier Bestel
On Wed, 2006-12-13 at 20:15 -0800, Linus Torvalds wrote:
 That said, I'm going to suggest that you people talk to your COMPANY 
 LAWYERS on this, and I'm personally not going to merge that particular 
 code unless you can convince the people you work for to merge it first.

That's quoting material :) Who's your master, by Linus.



Re: [PATCH] Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 12:38:08PM +0100, Karsten Weiss wrote:
 On Thu, 14 Dec 2006, Muli Ben-Yehuda wrote:
 
  On Wed, Dec 13, 2006 at 09:34:16PM +0100, Karsten Weiss wrote:
  
   BTW: It would be really great if this area of the kernel would get some 
   more and better documentation. The information at 
   linux-2.6/Documentation/x86_64/boot_options.txt is very terse. I had to 
   read the code to get a *rough* idea what all the iommu= options 
   actually do and how they interact.
  
  Patches happily accepted :-)
 
 Well, you asked for it. :-) So here's my little contribution. Please 
 *double* *check*!

Looks good, some nits below.

 (BTW: I would like to know what DAC and SAC means in this
 context)

Single / Double Address Cycle. DAC is used with 32-bit PCI to push a
64-bit address in two cycles.
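
A small worked example of what "two cycles" means for the address
itself: the 64-bit value is sent as a low and a high 32-bit half, and
only addresses with a nonzero high half need the second cycle. The
struct and function names below are made up for this sketch and do not
come from the kernel.

```c
#include <stdint.h>

/* The two 32-bit halves of a 64-bit bus address (illustrative only). */
struct dac_address {
    uint32_t low;  /* bits 0..31 */
    uint32_t high; /* bits 32..63 */
};

/* Split a 64-bit address into the two halves sent in a dual address cycle. */
struct dac_address dac_split(uint64_t addr)
{
    struct dac_address d;
    d.low = (uint32_t)(addr & 0xFFFFFFFFu);
    d.high = (uint32_t)(addr >> 32);
    return d;
}

/* Rebuild the 64-bit address from its halves. */
uint64_t dac_join(struct dac_address d)
{
    return ((uint64_t)d.high << 32) | d.low;
}

/* An address fits a single address cycle when its high half is zero. */
int needs_dac(uint64_t addr)
{
    return (addr >> 32) != 0;
}
```

For instance, 0x123456789ABCDEF0 splits into low 0x9ABCDEF0 and high
0x12345678, while 0xFFFFFFFF still fits a single cycle.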

 @@ -458,6 +458,11 @@
  # need this always selected by IOMMU for the VIA workaround
  config SWIOTLB
   bool
 + help
 +   Support for a software bounce buffer based IOMMU used on Intel
 +   systems which don't have a hardware IOMMU. Using this code
 +   PCI devices with 32bit memory access only are able to be
 +   used on systems with more than 3 GB.

I would rephrase as follows: Support for software bounce buffers used
on x86-64 systems which don't have a hardware IOMMU. Using this, PCI
devices which can only access 32 bits of memory can be used on systems
with more than 3 GB of memory.

 +   size set size of IOMMU (in bytes)

Due to historical precedent, some of these options are only valid for
GART. Perhaps mention for each option which IOMMUs it is valid for or
split them on a per IOMMU basis?

This one (size) is gart only.

 +   noagp    don't initialize the AGP driver and use full aperture.

gart only.

 +   off  don't initialize and use any kind of IOMMU.

all.

 +   leak turn on simple iommu leak tracing (only when
 +CONFIG_IOMMU_LEAK is on)

gart only.

 +   memaper[=order]  allocate an own aperture over RAM with size 32MB<<order.
 +(default: order=1, i.e. 64MB)

gart only.
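
A quick check of the arithmetic behind the memaper wording, assuming the
size really is 32 MB shifted left by order as the patch summary says.
The helper name is invented for this example.

```c
#include <stdint.h>

/* Aperture size in bytes for a given memaper order: 32 MB << order. */
uint64_t memaper_bytes(unsigned order)
{
    const uint64_t base = 32ULL * 1024 * 1024; /* 32 MB */
    return base << order;
}
```

With order=1 this gives 64 MB, matching the stated default, and order=2
gives 128 MB. Read as C, the old spelling 32MB^order would be an XOR
(32 ^ 1 is 33), and read as a power it would give 32 for order=1, so
neither reading matches the documented default.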

 +   noforce  don't force hardware IOMMU usage when it is not needed.
 +(default).

all.

 +   force    Force the use of the hardware IOMMU even when it is
 +            not actually needed (e.g. because < 3 GB memory).

all.

 +   merge    Do scatter-gather (SG) merging. Implies force (experimental)

gart only.

 +   nomerge  Don't do scatter-gather (SG) merging.

gart only.

 +   forcesac For SAC mode for masks <40bits  (experimental)

gart only.

 +   fullflushFlush AMD GART based hardware IOMMU on each allocation
 +(default)

gart only.

 +   nofullflush  Don't use IOMMU fullflush

gart only.

 +   allowed  overwrite iommu off workarounds for specific
 chipsets.

gart only.

 +   soft Use software bounce buffering (SWIOTLB) (default for Intel
 +machines). This can be used to prevent the usage
 +of an available hardware IOMMU.

all.

 +   noaperture   Ask the AMD GART based hardware IOMMU driver not to 
 +touch the aperture for AGP.

gart only.

 +   allowdac Allow DMA >4GB
 +When off all DMA over 4GB is forced through an IOMMU or
 +bounce buffering.

gart only.

 +   nodac    Forbid DMA >4GB

gart only.

 +   panic    Always panic when IOMMU overflows

gart and Calgary.

The rest looks good. Please resend and I'll add my Acked-by.

Cheers,
Muli


Re: Executability of the stack

2006-12-14 Thread Franck Pommereau
 # grep maps /proc/self/maps
 bfce8000-bfcfe000 rw-p bfce8000 00:00 0  [stack]
 
 this shows that the *intent* is to have it non-executable. 
 Not all x86 processors can enforce this. All modern ones do.

Mine is quite recent:

# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 CPU T7200  @ 2.00GHz
stepping: 6
cpu MHz : 1000.000
cache size  : 4096 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm
constant_tsc up pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips: 3996.23

 the alternative (showing effective permission) is equally confusing;
 apps would see permissions they didn't set...

Indeed, both are confusing (the other way is having permission that you
do not see). But which one is the most dangerous from a security point
of view? For me it is believing that you're protected while you're not.

 Maybe it comes from sharing source code for 64 bits and 32 bits
 architectures but if so, it should be possible (and highly desirable) to
 treat 32 bits differently.
 
 it's not a 32 bit thing, it's an older processors don't, newer ones
 do thing.

I've read that 64 bit processors have an execute bit at the page level
while 32 bit ones do not (only at the segment level). I didn't know that
newer 32 bit CPUs did have this bit.

 Can you paste your /proc/cpuinfo file here ? Maybe you have a processor
 with the capability but just haven't enabled it (either in the bios or
 in the kernel config)

I'd be happy to know how to enable it.

Thanks for your help.
Franck


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Olivier Galibert
On Thu, Dec 14, 2006 at 10:56:03AM +0100, Hans-Jürgen Koch wrote:
 A small German manufacturer produces high-end AD converter cards. He sells
 100 pieces per year, only in Germany and only with Windows drivers. He would
 now like to make his cards work with Linux. He has two driver programmers
 with little experience in writing Linux kernel drivers. What do you tell him?
 Write a large kernel module from scratch? Completely rewrite his code 
 because it uses floating point arithmetic?

Write a small kernel module which:
- creates a device node per card
- reads the data from the A/D as fast as possible and buffers it in main
  memory without touching it
- implements a read interface to read data from the buffer
- implements ioctls for whatever controls you need

And that's it.  All the rest can be done in userspace, safely, with
floating point, C++ and everything.  If the driver programmers are
worth their pay, their driver is probably already split logically at
where the userspace-kernel interface would be.

And small means small, like 200 lines or so, more if you want to have
fun with sysfs, poll, aio and their ilk, but that's not a necessity.
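
To give an idea of how little the buffering part needs, here is a
userspace sketch of the "buffer it without touching it, then hand it out
through read" split. It is not a kernel module: there is no locking,
no device node and no ioctl, and the ring size, the struct and the
function names are all invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* tiny on purpose; a real buffer would be larger */

/* Fixed-size ring of raw A/D samples (hypothetical layout). */
struct sample_ring {
    uint16_t data[RING_SIZE];
    size_t head;           /* total samples stored so far */
    size_t tail;           /* total samples handed to the reader */
    unsigned long dropped; /* samples lost because the ring was full */
};

/* Producer side (what the interrupt path would do): store the raw
 * sample untouched, or count it as dropped when the ring is full. */
void ring_push(struct sample_ring *r, uint16_t sample)
{
    if (r->head - r->tail == RING_SIZE) {
        r->dropped++;
        return;
    }
    r->data[r->head % RING_SIZE] = sample;
    r->head++;
}

/* Consumer side (what read() would do): copy out up to max samples in
 * arrival order and return how many were copied. */
size_t ring_read(struct sample_ring *r, uint16_t *out, size_t max)
{
    size_t n = 0;
    while (n < max && r->tail != r->head) {
        out[n++] = r->data[r->tail % RING_SIZE];
        r->tail++;
    }
    return n;
}
```

Everything that interprets the samples (scaling, filtering, floating
point) then stays on the reader's side of ring_read().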

  OG.



Re: Executability of the stack

2006-12-14 Thread Arjan van de Ven
On Thu, 2006-12-14 at 13:07 +0100, Franck Pommereau wrote:
  # grep maps /proc/self/maps
  bfce8000-bfcfe000 rw-p bfce8000 00:00 0  [stack]
  
  this shows that the *intent* is to have it non-executable. 
  Not all x86 processors can enforce this. All modern ones do.
 
 Mine is quite recent:

 mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm

the nx shows that if you configure your kernel correctly (enable PAE)
that you indeed have a non-executable capability, which will apply to
the stack (and afaik the heap too)

  the alternative (showing effective permission) is equally confusing;
  apps would see permissions they didn't set...
 
 Indeed, both are confusing (the other way is having permission that you
 do not see). But which one is the most dangerous from a security point
 of view? For me it is believing that you're protected while you're not.

it's debatable what the file means; the maps file currently shows software
permissions, not hardware-enforced permissions. The problem
is that if you show software permissions... it's harder to see the
kernel's view (vma's etc). I don't think there's a perfect answer.

It gets even more complex if you have something like execshield in use;
where the stack and heap are non-executable, unless you get a higher
executable mapping. In that case, the appearance of such a higher
mapping would change the visual mapping of other mappings. Outright
confusing as well :)
 
  Maybe it comes from sharing source code for 64 bits and 32 bits
  architectures but if so, it should be possible (and highly desirable) to
  treat 32 bits differently.
  
  it's not a 32 bit thing, it's an older processors don't, newer ones
  do thing.
 
 I've read that 64 bit processors have an execute bit at the page level
 while 32 bit ones do not (only at the segment level). I didn't know that
  newer 32 bit CPUs did have this bit.

your cpu has this bit, you just didn't turn it on ;(

  Can you paste your /proc/cpuinfo file here ? Maybe you have a processor
  with the capability but just haven't enabled it (either in the bios or
  in the kernel config)
 
 I'd be happy to know how to enable it.

enable
CONFIG_HIGHMEM64G=y

and you're all set.
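
If you want to check for the flag programmatically rather than by eye,
a whole-word match on the flags line is enough. This is only a sketch:
the function name is made up, and it works on a string you have already
read from the "flags" line of /proc/cpuinfo; it does not open the file
itself.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if 'flag' appears as a whole whitespace-separated word
 * in 'flags' (for example the text of the "flags" line). */
bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the beginning or after whitespace... */
        bool start_ok = (p == flags) || p[-1] == ' ' ||
                        p[-1] == '\t' || p[-1] == '\n';
        /* ...and end at the end of the string or before whitespace. */
        char end = p[len];
        bool end_ok = end == '\0' || end == ' ' ||
                      end == '\t' || end == '\n';

        if (start_ok && end_ok)
            return true;
        p++; /* keep looking after this partial match */
    }
    return false;
}
```

Matching whole words matters here because plain substring search would
report "sse" as present when only "sse2" or "ssse3" is listed.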


Greetings,
   Arjan van de Ven

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



[PATCH] slab: fix kmem_ptr_validate prototype

2006-12-14 Thread Peter Zijlstra
Some fallout of: 2e892f43ccb602e8ffad73396a1000f2040c9e0b

  CC  mm/slab.o
/usr/src/linux-2.6-git/mm/slab.c:3557: error: conflicting types for ‘kmem_ptr_validate’
/usr/src/linux-2.6-git/include/linux/slab.h:58: error: previous declaration of ‘kmem_ptr_validate’ was here


Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
---
 include/linux/slab.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-git/include/linux/slab.h
===
--- linux-2.6-git.orig/include/linux/slab.h 2006-12-14 11:56:35.0 +0100
+++ linux-2.6-git/include/linux/slab.h  2006-12-14 11:56:46.0 +0100
@@ -55,7 +55,7 @@ void *kmem_cache_zalloc(struct kmem_cach
 void kmem_cache_free(struct kmem_cache *, void *);
 unsigned int kmem_cache_size(struct kmem_cache *);
 const char *kmem_cache_name(struct kmem_cache *);
-int kmem_ptr_validate(struct kmem_cache *cachep, const void *ptr);
+int fastcall kmem_ptr_validate(struct kmem_cache *cachep, const void *ptr);
 
 #ifdef CONFIG_NUMA
 extern void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
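
The "conflicting types" error above comes from the declaration and the definition disagreeing on the calling convention. A rough standalone sketch of the situation follows. The macro mirrors how fastcall was defined for i386 at the time (register argument passing) and expands to nothing elsewhere, which is an assumption about why only some builds would trip over the mismatch. The function name is hypothetical and its body is a placeholder, not the real validation logic.

```c
#include <stddef.h>

/*
 * Sketch of the fastcall macro: on i386 it selects register argument
 * passing; on other architectures it expands to nothing.
 */
#if defined(__i386__)
#define fastcall __attribute__((regparm(3)))
#else
#define fastcall
#endif

struct kmem_cache;  /* opaque here, as a header would leave it */

/* Declaration, as the header carries it after the fix. */
int fastcall ptr_validate_demo(struct kmem_cache *cachep, const void *ptr);

/*
 * Definition. Dropping `fastcall` from either this line or the declaration
 * above makes the two disagree on i386, and the compiler rejects the file.
 */
int fastcall ptr_validate_demo(struct kmem_cache *cachep, const void *ptr)
{
    (void)cachep;           /* placeholder: the real function checks the cache */
    return ptr != NULL;
}
```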




Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread James Courtier-Dutton

Duncan Sands wrote:
I'm really not convinced about the user-mode thing unless somebody can 
show me a good reason for it. Not just some wouldn't it be nice kind of 
thing. A real, honest-to-goodness reason that we actually _want_ to see 
used.


Qemu?  It would be nice if emulators could directly drive hardware:
useful for reverse engineering windows drivers for example.

Duncan.


I have reverse engineered many windows drivers, and what you suggest is 
not at all helpful. For reverse engineering, one wants to see what is 
happening. I.e. capture all the IO, MMIO and DMA accesses.

Your suggestion bypasses this possibility.
For reverse engineering windows drivers, one puts breakpoints in the 
HAL.DLL code or replaces the HAL.DLL code with a debugging version of it 
while actually running windows.


Your approach runs into problems, e.g.:
There is a register on the card that sets the DMA base address, but you 
don't know which register this is. If you let the driver inside QEMU 
write to this register, it will write values suitable for the virtual 
machine instead of values suitable for the host OS. The DMA transaction 
will then write all over the wrong memory location, resulting in a crash.
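
A toy model of that failure, purely as an illustration: the sizes, the fixed guest offset, and all names below are hypothetical, and this is not how QEMU represents memory. The device only understands host addresses, so a guest address programmed into it untranslated lands outside the guest's RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOST_MEM_SIZE 64u
#define GUEST_BASE    32u   /* hypothetical: guest RAM starts here in host memory */
#define GUEST_SIZE    32u

/* The device writes to whatever host address its base register holds. */
void device_dma_write(uint8_t *host_mem, uint32_t bus_addr, uint8_t value)
{
    assert(host_mem != NULL && bus_addr < HOST_MEM_SIZE);
    host_mem[bus_addr] = value;
}

/*
 * Translate a guest-physical address to the host address backing it.
 * Returns 0 on success, -1 if the address is outside guest RAM.
 */
int guest_to_host(uint32_t guest_addr, uint32_t *host_addr)
{
    assert(host_addr != NULL);

    if (guest_addr >= GUEST_SIZE)
        return -1;
    *host_addr = GUEST_BASE + guest_addr;
    return 0;
}
```

Calling device_dma_write() with guest address 4 directly overwrites host byte 4, which belongs to the host; translating first with guest_to_host() puts the write at byte 36, inside the guest's RAM. The catch described above is that an emulator can only do this translation if it knows which register carries the address.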


One might be able to get round some of these problems with a combination 
of QEMU and a hacked up HAL.DLL, but it will be complicated.


James



Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Alan
On Thu, 14 Dec 2006 10:56:03 +0100
Hans-Jürgen Koch [EMAIL PROTECTED] wrote:

 * They let somebody write the small kernel module they need to handle 
 interrupts in a _clean_ way. This module can easily be checked and could
 even be included in a mainline kernel.

And might as well do the mmap work as well. I'm not clear what uio gives
us that a decently written pair of PCI and platform template drivers for
people to use would not do more cleanly.

Also, in many of these cases you might not want stuff in userspace, but the
uio model would push it that way, which seems to be an unfortunate side
effect. Yes, some probably do want to go that way, but not all.



Re: [PATCH 2.6.19-git19] BUG due to bad argument to ieee80211softmac_assoc_work

2006-12-14 Thread Michael Bommarito

Ah, apologies, it's exam time and I probably didn't look hard enough
on the mailing list before posting.  For the record though, I'd posted
the bug (no patch) to bugzilla on the 9th (although it looks as if the
email address it was assigned to is actually defunct - anyone know why
bugzilla is still using [EMAIL PROTECTED] ?)
Anyway, again, sorry for the duplicate!

-Mike

On 12/14/06, Ray Lee [EMAIL PROTECTED] wrote:

On 12/13/06, Michael Bommarito [EMAIL PROTECTED] wrote:
 Sorry, realized I might not have been clear as to what I meant!  The
 patch was attached to the bugzilla entry, but I'll attach it here as
 well.  My description of the patch itself was really as complicated as
 it gets too (just two lines, switch (void*)mac to
 &mac->associnfo.work.work in
 net/ieee80211/softmac/ieee80211softmac_assoc.c), just a small bug
 while somebody was rushing through the work/delayed_work changes.

--- net/ieee80211/softmac/ieee80211softmac_assoc.c  2006-12-13 11:23:03.0 -0500
+++ net/ieee80211/softmac/ieee80211softmac_assoc.c  2006-12-13 11:24:26.0 -0500
@@ -167,7 +167,7 @@
 ieee80211softmac_assoc_notify_scan(struct net_device *dev, int event_type, void *context)
 {
struct ieee80211softmac_device *mac = ieee80211_priv(dev);
-   ieee80211softmac_assoc_work((void*)mac);
+   ieee80211softmac_assoc_work(&mac->associnfo.work.work);
 }

 static void
@@ -177,7 +177,7 @@

switch (event_type) {
case IEEE80211SOFTMAC_EVENT_AUTHENTICATED:
-   ieee80211softmac_assoc_work((void*)mac);
+   ieee80211softmac_assoc_work(&mac->associnfo.work.work);
break;
case IEEE80211SOFTMAC_EVENT_AUTH_FAILED:
case IEEE80211SOFTMAC_EVENT_AUTH_TIMEOUT:

Good catch, though it was already caught. See:

   http://lkml.org/lkml/2006/12/12/46

...for (basically) the same patch.

But again, good catch :-).
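
For readers wondering why the argument matters: after the work/delayed_work changes, a work handler is handed a pointer to the embedded work_struct and recovers its owning structure from that pointer. Passing the outer structure itself, cast to void *, makes that arithmetic start from the wrong address. Below is a simplified, self-contained sketch of the pattern. The struct layouts and the names ending in _demo are stand-ins for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; layouts are simplified. */
struct work_struct  { int pending; };
struct delayed_work { struct work_struct work; int timer; };

struct assoc_info   { int associating; struct delayed_work work; };
struct softmac_demo { int id; struct assoc_info associnfo; };

/* Same idea as the kernel's container_of(): step back to the owner. */
#define container_of_demo(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The handler gets the embedded work_struct and recovers its owner. */
int assoc_work_demo(struct work_struct *work)
{
    struct softmac_demo *mac;

    assert(work != NULL);
    mac = container_of_demo(work, struct softmac_demo, associnfo.work.work);
    return mac->id;
}
```

A direct call therefore has to pass &mac->associnfo.work.work, as in the patch. Passing mac itself would make the handler subtract the member offset from the start of the structure and read memory that precedes it.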



