[PATCH 0/6] various scheduler patches

2007-10-31 Thread Peter Zijlstra
My current scheduler queue, seems to work well on lappy -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

[PATCH 5/6] sched: SCHED_FIFO/SCHED_RR watchdog timer

2007-10-31 Thread Peter Zijlstra
Introduce a new rlimit that allows the user to set a runtime timeout on real-time tasks. Once this limit is exceeded the task will receive SIGXCPU. Input and ideas by Thomas Gleixner and Lennart Poettering. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] CC: Thomas Gleixner [EMAIL PROTECTED] CC

[PATCH 3/6] sched: high-res preemption tick

2007-10-31 Thread Peter Zijlstra
to minimize this by delivering preemption points spot-on. The average frequency of this extra interrupt is sched_latency / nr_latency. Which need not be higher than 1/HZ, it's just that the distribution within the sched_latency period is important. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED

[PATCH 4/6] sched: sched_rt_entity

2007-10-31 Thread Peter Zijlstra
Move the task_struct members specific to rt scheduling together. A future optimization could be to put sched_entity and sched_rt_entity into a union. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] CC: Srivatsa Vaddagiri [EMAIL PROTECTED] --- include/linux/init_task.h |5 +++-- include/linux

[PATCH 1/6] sched: move the group scheduling primitives around

2007-10-31 Thread Peter Zijlstra
The next patch will make sched_slice group aware, reorder the group scheduling primitives so that they don't need fwd declarations. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] CC: Srivatsa Vaddagiri [EMAIL PROTECTED] --- kernel/sched_fair.c | 190

Re: [PATCH 3/6] sched: high-res preemption tick

2007-10-31 Thread Peter Zijlstra
On Wed, 2007-10-31 at 22:53 +0100, Andi Kleen wrote: Peter Zijlstra [EMAIL PROTECTED] writes: Use HR-timers (when available) to deliver an accurate preemption tick. The regular scheduler tick that runs at 1/HZ can be too coarse when nice levels are used. The fairness system will still

Re: [PATCH 5/6] sched: SCHED_FIFO/SCHED_RR watchdog timer

2007-10-31 Thread Peter Zijlstra
On Wed, 2007-10-31 at 22:49 +0100, Andi Kleen wrote: Peter Zijlstra [EMAIL PROTECTED] writes: Introduce a new rlimit that allows the user to set a runtime timeout on real-time tasks. Once this limit is exceeded the task will receive SIGXCPU. Nice idea. It would be even nicer if you

Re: [PATCH 0/6] various scheduler patches

2007-11-01 Thread Peter Zijlstra
On Thu, 2007-11-01 at 09:29 +0100, Ingo Molnar wrote: * Peter Zijlstra [EMAIL PROTECTED] wrote: My current scheduler queue, seems to work well on lappy nice stuff! Both the hrtimers-tick feature and the rtlimit looks pretty good. Thanks! I'm wondering how well it works on SMP

Re: [PATCH 3/6] sched: high-res preemption tick

2007-11-01 Thread Peter Zijlstra
On Wed, 2007-10-31 at 22:53 +0100, Andi Kleen wrote: Peter Zijlstra [EMAIL PROTECTED] writes: Use HR-timers (when available) to deliver an accurate preemption tick. The regular scheduler tick that runs at 1/HZ can be too coarse when nice levels are used. The fairness system will still

Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy

2007-11-01 Thread Peter Zijlstra
On Thu, 2007-11-01 at 17:01 +0530, Srivatsa Vaddagiri wrote: On Wed, Oct 31, 2007 at 10:10:32PM +0100, Peter Zijlstra wrote: Currently the ideal slice length does not take group scheduling into account. Change it so that it properly takes all the runnable tasks on this cpu into account

Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy

2007-11-01 Thread Peter Zijlstra
On Thu, 2007-11-01 at 12:58 +0100, Peter Zijlstra wrote: sched_slice() is about latency, its intended purpose is to ensure each task is run exactly once during sched_period() - which is sysctl_sched_latency when nr_running <= sysctl_sched_nr_latency, and otherwise linearly scales latency
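
The period rule described in this message can be sketched in userspace C as follows. This is only an illustration of the stated rule (a fixed latency target up to nr_latency runnable tasks, scaled linearly beyond that); the function names are hypothetical, and the equal-weight slice is a simplifying assumption, since the snippet does not show how task weights enter the real sched_slice().

```c
#include <stdint.h>

/*
 * Sketch of the period rule: the latency target itself while
 * nr_running <= nr_latency, scaled linearly with nr_running beyond that.
 */
static uint64_t sched_period_sketch(uint64_t latency_ns,
                                    unsigned long nr_latency,
                                    unsigned long nr_running)
{
    if (nr_running <= nr_latency)
        return latency_ns;
    return latency_ns * nr_running / nr_latency;
}

/*
 * Assumption: all tasks have equal weight, so each one gets an equal
 * share of the period and therefore runs once per period.
 */
static uint64_t equal_slice_sketch(uint64_t latency_ns,
                                   unsigned long nr_latency,
                                   unsigned long nr_running)
{
    if (nr_running == 0)
        return 0;
    return sched_period_sketch(latency_ns, nr_latency, nr_running) / nr_running;
}
```

With a 20 ms target and nr_latency of 5, three runnable tasks share a 20 ms period, while ten runnable tasks stretch the period to 40 ms so that each still gets a 4 ms slice.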

Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy

2007-11-01 Thread Peter Zijlstra
On Thu, 2007-11-01 at 13:03 +0100, Peter Zijlstra wrote: On Thu, 2007-11-01 at 12:58 +0100, Peter Zijlstra wrote: sched_slice() is about latency, its intended purpose is to ensure each task is run exactly once during sched_period() - which is sysctl_sched_latency when nr_running

Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy

2007-11-01 Thread Peter Zijlstra
On Thu, 2007-11-01 at 12:51 +0100, Peter Zijlstra wrote: On Thu, 2007-11-01 at 17:01 +0530, Srivatsa Vaddagiri wrote: On Wed, Oct 31, 2007 at 10:10:32PM +0100, Peter Zijlstra wrote: Currently the ideal slice length does not take group scheduling into account. Change it so

Re: [PATCH] nfs: fix nfs_writepage()

2007-10-17 Thread Peter Zijlstra
On Wed, 2007-10-17 at 18:26 +0200, Peter Zijlstra wrote: On Wed, 2007-10-17 at 12:24 -0400, Trond Myklebust wrote: On Wed, 2007-10-17 at 17:57 +0200, Peter Zijlstra wrote: On Wed, 2007-10-17 at 11:47 -0400, Trond Myklebust wrote: On Wed, 2007-10-17 at 11:45 -0400, Trond Myklebust wrote

Re: + reiserfs-fix-up-lockdep-warnings.patch added to -mm tree

2007-10-18 Thread Peter Zijlstra
[EMAIL PROTECTED] Cc: Chris Mason [EMAIL PROTECTED] Cc: Vladimir V. Saveliev [EMAIL PROTECTED] Cc: Peter Zijlstra [EMAIL PROTECTED] Yep looks good, want me to push this through the lockdep tree, or will you forward it? In which case: Acked-by: Peter Zijlstra [EMAIL PROTECTED] Signed-off

Re: [patch 6/8] pull RT tasks

2007-10-19 Thread Peter Zijlstra
On Fri, 2007-10-19 at 21:24 +0200, Peter Zijlstra wrote: On Fri, 2007-10-19 at 14:43 -0400, Steven Rostedt wrote: plain text document attachment (rt-balance-pull-tasks.patch) +static int pull_rt_task(struct rq *this_rq) +{ + struct task_struct *next; + struct task_struct *p

Re: [patch 6/8] pull RT tasks

2007-10-19 Thread Peter Zijlstra
On Fri, 2007-10-19 at 14:43 -0400, Steven Rostedt wrote: plain text document attachment (rt-balance-pull-tasks.patch) +static int pull_rt_task(struct rq *this_rq) +{ + struct task_struct *next; + struct task_struct *p; + struct rq *src_rq; + int this_cpu = this_rq->cpu; +

Re: [BLOCK2MTD] WARNING: at kernel/lockdep.c:2331 lockdep_init_map()

2007-10-19 Thread Peter Zijlstra
On Fri, 2007-10-19 at 13:53 -0400, Erez Zadok wrote: I've been having this problem for some time with mtd, which I use to mount jffs2 images (for unionfs testing). I've seen it in several recent major kernels, including 2.6.24. Here's the sequence of ops I perform: # cp jffs2-empty.img

Re: [NET]: Fix possible dev_deactivate race condition

2007-10-19 Thread Peter Zijlstra
On Fri, 2007-10-19 at 13:36 +0800, Herbert Xu wrote: On Fri, Oct 19, 2007 at 12:20:25PM +0800, Herbert Xu wrote: In fact this bug exists elsewhere too. For example, the network stack does this in net/sched/sch_generic.c: /* Wait for outstanding qdisc_run calls. */

Re: 100% iowait on one of cpus in current -git

2007-10-22 Thread Peter Zijlstra
On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote: Hi, I found a bug in current -git: On my system one of the cpus stays 100% in iowait mode (I have core 2 duo) Otherwise the system works OK, no disk activity and/or slowdown. Suspecting that this is a swap-related problem I tried to turn

[RFC/PATCH 0/3] rt: workqueue PI support

2007-10-22 Thread Peter Zijlstra
Hi, I revived the PI-workqueue effort. I took daniel's plist patch and went about fixing the only outstanding issue that I could remember, barriers. This patch compiles and boots on x86_64, not much testing has been done.

[RFC/PATCH 2/3] rt: PI-workqueue support

2007-10-22 Thread Peter Zijlstra
-off-by: Daniel Walker [EMAIL PROTECTED] Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/workqueue.h |7 --- kernel/power/poweroff.c |1 + kernel/sched.c|4 kernel/workqueue.c| 40 +--- 4 files changed

[RFC/PATCH 3/3] rt: PI-workqueue: fix barriers

2007-10-22 Thread Peter Zijlstra
. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/plist.h | 14 ++ kernel/workqueue.c| 104 ++ lib/plist.c | 10 +++- 3 files changed, 110 insertions(+), 18 deletions(-) Index: linux-2.6/include/linux/plist.h

[RFC/PATCH 1/3] rt: rename rt_mutex_setprio to task_setprio

2007-10-22 Thread Peter Zijlstra
With there being multiple non-mutex users of this function it's past time it got renamed. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/sched.h |7 ++- kernel/rcupreempt-boost.c |4 ++-- kernel/sched.c|8 ++-- 3 files changed, 10 insertions

Re: 100% iowait on one of cpus in current -git

2007-10-22 Thread Peter Zijlstra
On Mon, 2007-10-22 at 11:59 +0200, Maxim Levitsky wrote: On Monday 22 October 2007 11:41:57 Peter Zijlstra wrote: On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote: Hi, I found a bug in current -git: On my system one of the cpus stays 100% in iowait mode (I have core 2 duo

[RFC/PATCH 4/3] rt: PI-workqueue: fixup the barrier prio

2007-10-22 Thread Peter Zijlstra
small fix to the PI stuff, we lost the prio of the barrier waiter. --- Index: linux-2.6/kernel/workqueue.c === --- linux-2.6.orig/kernel/workqueue.c +++ linux-2.6/kernel/workqueue.c @@ -264,6 +264,7 @@ struct wq_full_barrier {

Re: [RFC/PATCH 2/3] rt: PI-workqueue support

2007-10-22 Thread Peter Zijlstra
On Mon, 2007-10-22 at 08:00 -0400, Steven Rostedt wrote: -- On Mon, 22 Oct 2007, Peter Zijlstra wrote: 5B Index: linux-2.6/kernel/workqueue.c === --- linux-2.6.orig/kernel/workqueue.c +++ linux-2.6/kernel/workqueue.c

[RFC/PATCH 5/3] rt: PI-workqueue: fixup the barrier prio

2007-10-22 Thread Peter Zijlstra
Steven is right in that I did over-use normal_prio a bit. The barriers should use the boosted prio. --- Index: linux-2.6/kernel/workqueue.c === --- linux-2.6.orig/kernel/workqueue.c +++ linux-2.6/kernel/workqueue.c @@ -387,7 +387,7

Re: [RFC/PATCH 3/3] rt: PI-workqueue: fix barriers

2007-10-22 Thread Peter Zijlstra
On Mon, 2007-10-22 at 21:34 +0400, Oleg Nesterov wrote: On 10/22, Peter Zijlstra wrote: @@ -136,10 +138,10 @@ static void insert_work(struct cpu_workq */ smp_wmb(); plist_node_init(work-entry, prio); - plist_add(work-entry, cwq-worklist); + __plist_add(work-entry

Re: [PATCH] reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file

2007-10-23 Thread Peter Zijlstra
Levitsky [EMAIL PROTECTED] Cc: Peter Zijlstra [EMAIL PROTECTED] Signed-off-by: Fengguang Wu [EMAIL PROTECTED] --- fs/reiserfs/stree.c |3 --- 1 file changed, 3 deletions(-) --- linux-2.6.24-git17.orig/fs/reiserfs

[RFC/PATCH 1/5] rt: rename rt_mutex_setprio to task_setprio

2007-10-23 Thread Peter Zijlstra
With there being multiple non-mutex users of this function it's past time it got renamed. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/sched.h |7 ++- kernel/rcupreempt-boost.c |4 ++-- kernel/sched.c|8 ++-- 3 files changed, 10 insertions

[RFC/PATCH 3/5] rt: plist_head_splice

2007-10-23 Thread Peter Zijlstra
merge-sort two plists together Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/plist.h |2 + lib/plist.c | 68 -- 2 files changed, 68 insertions(+), 2 deletions(-) Index: linux-2.6/include/linux/plist.h

[RFC/PATCH 5/5] rt: PI-workqueue: fix barriers

2007-10-23 Thread Peter Zijlstra
this barrier stack. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- kernel/workqueue.c | 111 + 1 file changed, 95 insertions(+), 16 deletions(-) Index: linux-2.6/kernel/workqueue.c

[RFC/PATCH 0/5] rt: workqueue PI support -v2

2007-10-23 Thread Peter Zijlstra
Still not more than boot tested,... Oleg, do you have workqueue test modules? Changes since -v1: - proper plist_head_splice() implementation - removed the plist_add(, .tail) thing, using prio -1 instead. (patch against v2.6.23-rt1)

[RFC/PATCH 2/5] rt: list_splice2

2007-10-23 Thread Peter Zijlstra
Introduce list_splice2{,_tail}() which will splice a sub-list denoted by two list items instead of the full list. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- drivers/usb/host/ehci-q.c |2 - include/linux/list.h | 66 -- lib
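
To illustrate what splicing "a sub-list denoted by two list items" can look like, here is a small userspace sketch on a circular doubly-linked list. It is not the code from the patch: the node type, the function names, and the choice to insert right after the destination head are all assumptions made for the example.

```c
/* Minimal circular doubly-linked list, in the style of the kernel's list_head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init_sketch(struct list_node *head)
{
    head->prev = head;
    head->next = head;
}

static void list_add_tail_sketch(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/*
 * Move the run first..last (inclusive) out of its current list and insert
 * it right after 'head'. Assumes first and last are on the same list with
 * first at or before last, and that head is not inside the run.
 */
static void splice_range_sketch(struct list_node *first,
                                struct list_node *last,
                                struct list_node *head)
{
    /* Unlink the run from the source list. */
    first->prev->next = last->next;
    last->next->prev = first->prev;

    /* Link the run in between head and head->next. */
    first->prev = head;
    last->next = head->next;
    head->next->prev = last;
    head->next = first;
}
```

The point of such an operation is that only the boundary pointers are touched, so the cost stays constant regardless of how many entries sit between the two items.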

[RFC/PATCH 4/5] rt: PI-workqueue support

2007-10-23 Thread Peter Zijlstra
-off-by: Daniel Walker [EMAIL PROTECTED] Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/workqueue.h |7 --- kernel/power/poweroff.c |1 + kernel/workqueue.c| 40 +--- 3 files changed, 30 insertions(+), 18 deletions

Re: mm: soft lockup in 2.6.23-6636. caused by drop_caches ?

2007-10-23 Thread Peter Zijlstra
On Tue, 2007-10-23 at 14:55 +0100, richard kennedy wrote: on git v2.6.23-6636-g557ebb7 I'm getting a soft lockup when running a simple disk write test case on AMD64X2, sata hd ext3. the test does this sync echo 3 > /proc/sys/vm/drop_caches for (( i=0; $i < $count; i=$i+1 )) ; do dd

Re: [RFC/PATCH 3/5] rt: plist_head_splice

2007-10-23 Thread Peter Zijlstra
On Tue, 2007-10-23 at 11:10 -0400, Steven Rostedt wrote: -- On Tue, 23 Oct 2007, Peter Zijlstra wrote: + +void plist_head_splice(struct plist_head *src, struct plist_head *dst) +{ + struct plist_node *src_iter_first, *src_iter_last, *dst_iter; + struct plist_node *tail

Re: [RFC/PATCH 3/5] rt: plist_head_splice

2007-10-23 Thread Peter Zijlstra
Index: linux-2.6/lib/plist.c === --- linux-2.6.orig/lib/plist.c +++ linux-2.6/lib/plist.c @@ -167,8 +167,8 @@ void plist_head_splice(struct plist_head list_del_init(src_iter_first-plist.prio_list); if

[RFC/PATCH 6/5] rt: PI-workqueue: wait_on_work() fixup

2007-10-23 Thread Peter Zijlstra
. [ will be folded into the previous patch on next posting ] Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- kernel/workqueue.c | 74 - 1 file changed, 29 insertions(+), 45 deletions(-) Index: linux-2.6/kernel/workqueue.c

[RFC/PATCH 7/5] rt: PI-workqueue: propagate prio for delayed work

2007-10-23 Thread Peter Zijlstra
Subject: rt: PI-workqueue: propagate prio for delayed work Delayed work loses its enqueue priority, and will be enqueued on the prio of the softirq thread. Amend this. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/workqueue.h |1 + kernel/workqueue.c| 16

Re: 2.6.24-rc1 fails with lockup and BUG:

2007-10-24 Thread Peter Zijlstra
with a solution for this. Does this help? --- Subject: lockdep: invalid irq usage this function can be called from hardirq context. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- Index: linux-2.6-2/kernel/sched_debug.c

Re: [RFC PATCH] x86: explicit call to mmiotrace in do_page_fault()

2008-02-09 Thread Peter Zijlstra
On Sat, 2008-02-09 at 10:01 -0800, Arjan van de Ven wrote: default n help This will build a kernel module called mmiotrace. + Making this a built-in is heavily discouraged. why is this? Wouldn't it be nice if distros just shipped with this in their kernel by default

Re: [RFC PATCH] x86: explicit call to mmiotrace in do_page_fault()

2008-02-09 Thread Peter Zijlstra
On Sat, 2008-02-09 at 19:52 +0200, Pekka Paalanen wrote: +int mmiotrace_register_pf(pf_handler_func new_pfh) { + int ret = 0; unsigned long flags; + spin_lock_irqsave(mmiotrace_handler_lock, flags); + if (mmiotrace_pf_handler) + ret = -EBUSY; + else +

Re: [PATCH] [6/8] Account overlapped mappings in end_pfn_map

2008-02-11 Thread Peter Zijlstra
On Mon, 2008-02-11 at 14:27 +0100, Andi Kleen wrote: Ok patch with hungarized variables appended. -static void __meminit +static unsigned long __meminit phys_pmd_update(pud_t *pud, unsigned long address, unsigned long end) { + unsigned long true_end; pmd_t *pmd =

Re: lock_task_group_list() can be called from the atomic context

2008-02-11 Thread Peter Zijlstra
[8020b0dd] ? default_idle+0x43/0x76 [8020b0db] ? default_idle+0x41/0x76 [8020b09a] ? default_idle+0x0/0x76 [8020b186] ? cpu_idle+0x76/0x98 separate the tg->shares protection from the task_group lock. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- kernel/sched.c | 37

Re: [git pull for -mm] CPU isolation extensions (updated2)

2008-02-12 Thread Peter Zijlstra
On Mon, 2008-02-11 at 20:10 -0800, Max Krasnyansky wrote: Andrew, looks like Linus decided not to pull this stuff. Can we please put it into -mm then. My tree is here git://git.kernel.org/pub/scm/linux/kernel/git/maxk/cpuisol-2.6.git Please use 'master' branch (or 'for-linus' they

Re: Regression in latest sched-git

2008-02-12 Thread Peter Zijlstra
On Wed, 2008-02-13 at 00:23 +0530, Dhaval Giani wrote: Hi Ingo, I've been running the latest sched-git through some tests. Here is essentially what I am doing, 1. Mount the control group 2. Create 3-4 groups 3. Start kernbench inside each group 4. Run cpu hogs in each group

Re: 2.6.24-git2: Oracle 11g VKTM process enters R state on startup and is unkillable [still broken in 2.6.25-rc1]

2008-02-12 Thread Peter Zijlstra
On Tue, 2008-02-12 at 15:35 +0100, Alessandro Suardi wrote: On Feb 12, 2008 2:44 PM, Peter Zijlstra [EMAIL PROTECTED] wrote: On Tue, 2008-02-12 at 00:12 +0100, Rafael J. Wysocki wrote: On Monday, 11 of February 2008, Ingo Molnar wrote: * Ingo Molnar [EMAIL PROTECTED] wrote

[RFC][PATCH] sched: fair-group: load_balance_monitor vs wakeups

2008-02-12 Thread Peter Zijlstra
option was turning it into a timer, this does work however it now runs from hardirq context and I worry that it might be too heavy, esp on larger boxen. However, if we split the global lb_monitor into a per root-domain monitor I think it might be doable,.. thoughts? Signed-off-by: Peter Zijlstra

Re: [RFC][PATCH] sched: fair-group: load_balance_monitor vs wakeups

2008-02-12 Thread Peter Zijlstra
On Tue, 2008-02-12 at 13:59 +0100, Peter Zijlstra wrote: + printk(KERN_EMERG load_balance_shares: %p %d\n, lb_monitor, state); Uhm,.. seems I forgot to refresh after removing the debug info.. :-)

Re: [RFC PATCH] RTTIME watchdog timer proc interface

2008-02-13 Thread Peter Zijlstra
On Tue, 2008-02-12 at 14:21 -0800, Hiroshi Shimamoto wrote: Peter Zijlstra wrote: On Mon, 2008-02-11 at 13:44 -0800, Hiroshi Shimamoto wrote: Hi Ingo, I think an interface to access RLIMIT_RTTIME from outside is useful. It makes administrator able to set RLIMIT_RTTIME watchdog

Re: Regression in latest sched-git

2008-02-13 Thread Peter Zijlstra
On Wed, 2008-02-13 at 08:30 +0530, Srivatsa Vaddagiri wrote: On Tue, Feb 12, 2008 at 08:40:08PM +0100, Peter Zijlstra wrote: Yes, latency isolation is the one thing I had to sacrifice in order to get the normal latencies under control. Hi Peter, I don't have easy solution in mind

Re: [BUG] snd-hda-intel

2008-02-13 Thread Peter Zijlstra
On Wed, 2008-02-13 at 15:45 +0100, Takashi Iwai wrote: See /proc/asound/card0/codec#* files. Better to run once alsa-info.sh and show its output: http://hg.alsa-project.org/alsa/raw-file/tip/alsa-info.sh http://pastebin.ca/902469

Re: [BUG] snd-hda-intel

2008-02-13 Thread Peter Zijlstra
On Wed, 2008-02-13 at 15:46 +0100, Takashi Iwai wrote: At Wed, 13 Feb 2008 15:41:06 +0100, Peter Zijlstra wrote: Is pulseaudio 32bit? Yes, 32bit userspace. One thing forgot to ask: Can you reproduce the bug with other apps? With 64bit apps? I don't currently have 64bit

Re: [BUG] snd-hda-intel

2008-02-13 Thread Peter Zijlstra
On Wed, 2008-02-13 at 15:31 +0100, Takashi Iwai wrote: At Wed, 13 Feb 2008 15:25:14 +0100, Peter Zijlstra wrote: lspci -vvv: 00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02) Is this a regression, i.e. did you get similar Oops

Re: [BUG] snd-hda-intel

2008-02-13 Thread Peter Zijlstra
On Wed, 2008-02-13 at 16:39 +0100, Takashi Iwai wrote: At Wed, 13 Feb 2008 15:55:37 +0100, Peter Zijlstra wrote: On Wed, 2008-02-13 at 15:45 +0100, Takashi Iwai wrote: See /proc/asound/card0/codec#* files. Better to run once alsa-info.sh and show its output: http://hg.alsa

Re: Regression in latest sched-git

2008-02-13 Thread Peter Zijlstra
On Wed, 2008-02-13 at 22:07 +0530, Dhaval Giani wrote: On Wed, Feb 13, 2008 at 10:04:44PM +0530, Dhaval Giani wrote: On the same lines, I cant understand how we can be seeing 700ms latency (below) unless we had: large number of active groups/users and large number of tasks

Re: [REGRESSION] 2.6.25-rc1 does not boot on Alpha

2008-02-13 Thread Peter Zijlstra
On Tue, 2008-02-12 at 23:29 -0600, Bob Tracy wrote: This isn't going to be terribly useful other than giving someone a heads-up there's a problem with something in 2.6.25-rc1 on the Alpha PWS 433au. I get the usual messages out of aboot, including aboot: zero-filling 210392 bytes at

[PATCH] xtime_lock vs update_process_times

2008-02-13 Thread Peter Zijlstra
This time with LKML CC'ed. Sorry for the duplication. Subject: xtime_lock vs update_process_times From: Peter Zijlstra [EMAIL PROTECTED] ( repost from: http://lkml.org/lkml/2008/1/28/101 ) Commit: d3d74453c34f8fd87674a8cf5b8a327c68f22e99 Subject: hrtimer: fixup

Re: Regression in latest sched-git

2008-02-14 Thread Peter Zijlstra
Hi Dhaval, How does this patch (on top of todays sched-devel.git) work for you? It keeps my laptop nice and spiffy when I run let i=0; while [ $i -lt 100 ]; do let i+=1; while :; do :; done done under a third user (nobody). This generates huge latencies for the nobody user (up to 1.6s) but

[RFC][PATCH 1/2] sched: fair-group: rework load_balance_monitor

2008-02-14 Thread Peter Zijlstra
option was turning it into a timer, this does work however it now runs from hardirq context and I worry that it might be too heavy, esp on larger boxen. The next patch will split this single instance into per root-domain balancers. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- kernel/sched.c

[RFC][PATCH 0/2] reworking load_balance_monitor

2008-02-14 Thread Peter Zijlstra
Hi, Here are the current patches that rework load_balance_monitor. The main reason for doing this is to eliminate the wakeups the thing generates, esp. on an idle system. The bonus is that it removes a kernel thread. Paul, Gregory - the thing that bothers me most atm is the lack of rd->load_balance.

[RFC][PATCH 2/2] sched: fair-group: per root-domain load balancing

2008-02-14 Thread Peter Zijlstra
Currently the lb_monitor will walk all the domains/cpus from a single cpu's timer interrupt. This will cause massive cache-thrashing and cache-line bouncing on larger machines. Split the lb_monitor into root_domain (disjoint sched-domains). Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] CC

[PATCH 0/2] for sched-devel.git

2008-02-15 Thread Peter Zijlstra
Hi Ingo, Would you stick these into sched-devel. The first patch should address the latency isolation issue. While the second rectifies a massive brainfart :-)

[PATCH 1/2] sched: fair: virtual deadline scheduling

2008-02-15 Thread Peter Zijlstra
to meet. This includes the latency into the scheduling decision. [*] - EDF is correct up until load 1, after that it is not a closed system so improvement is possible here. It is usable because the system strives to generate the load 1 situation. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED

[PATCH 2/2] sched: fair: fix calc_delta_asym

2008-02-15 Thread Peter Zijlstra
The goal of calc_delta_asym() is to be asymmetric around NICE_0_LOAD, in that it favours >=0 over <0. The current implementation does not achieve that. -20 | | 0 +--- .' 19 .' Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- kernel

Re: 2.6.24-mm1 bugs

2008-02-15 Thread Peter Zijlstra
On Fri, 2008-02-15 at 12:43 +0100, Miklos Szeredi wrote: - strange key repeating (short press of a key results in lots of key press events) when there's some sort of load (I/O?) I may have seen this on non-mm kernels as well, but it's definitely more noticeable in -mm Do you have

Re: [PATCH 3/4] IPMI: convert locked counters to atomics

2008-02-15 Thread Peter Zijlstra
On Thu, 2008-02-14 at 12:30 -0600, Corey Minyard wrote: +/* + * Various statistics for IPMI, these index stats[] in the ipmi_smi + * structure. + */ +/* Commands we got from the user that were invalid. */ +#define IPMI_STAT_sent_invalid_commands 0 + +/* Commands we

Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Peter Zijlstra
On Mon, 2012-10-29 at 21:39 +0100, Stephane Eranian wrote: But I think the right mechanism would be one where you can add events at boot time based on CPU model. It could be used to add the common events as well in the common part of the init code. mlin once posted something like that, it

Re: [PATCH V2 RFC 3/3] kvm: Check system load and handle different commit cases accordingly

2012-10-30 Thread Peter Zijlstra
On Tue, 2012-10-30 at 11:27 +0530, Raghavendra K T wrote: Okay, now IIUC, usage of *any* global measure is bad? Yep, people like to carve up their machines, esp. now that they're somewhat bigger than they used to be. This can result in very asymmetric loads, no global measure can ever deal with

Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods

2012-10-30 Thread Peter Zijlstra
On Tue, 2012-10-30 at 15:59 +0900, Namhyung Kim wrote: Yes, the callchain part needs to be improved. Peter's idea indeed looks good to me too. FWIW, I think this is exactly what sysprof does, except that tool isn't usable for other reasons.. You might want to look at it though.

Re: [GIT PULL 0/9] perf/core improvements and fixes

2012-10-30 Thread Peter Zijlstra
On Tue, 2012-10-30 at 09:18 +0100, Ingo Molnar wrote: The optimal way, I guess, would be to have some cache file with the results of such feature tests, that would be created and then used till the build fails using its findings, which would trigger a new feature check round,

Re: [RFC][PATCH] perf: Add a few generic stalled-cycles events

2012-10-31 Thread Peter Zijlstra
On Tue, 2012-10-30 at 23:40 -0700, Sukadev Bhattiprolu wrote: So instead of the names I came up with in this patch, stalled-cycles-fixed-point we could use the name used in the CPU spec - 'cmplu_stall_fxu' in the arch specific code ? You could, but I would advise against it. Human readable

Re: [PATCH tip/core/rcu 0/2] v2 Add callback-free CPUs

2012-10-31 Thread Peter Zijlstra
On Tue, 2012-10-30 at 20:45 -0700, Paul E. McKenney wrote: This commit therefore adds the ability for selected CPUs (rcu_nocbs= boot parameter) to have their callbacks offloaded to kthreads, inspired by Joe Korty's and Jim Houston's JRCU. If the rcu_nocb_poll boot parameter is also specified,

Re: [PATCH 1/4] uprobes: Kill set_swbp()->is_swbp_at_addr()

2012-09-24 Thread Peter Zijlstra
On Sun, 2012-09-23 at 22:19 +0200, Oleg Nesterov wrote: A separate patch for better documentation. set_swbp()->is_swbp_at_addr() is not needed for correctness, it is harmless to do the unnecessary __replace_page(old_page, new_page) when these 2 pages are identical. And it can not be

Re: [PATCH 3/4] uprobes: Kill set_orig_insn()->is_swbp_at_addr()

2012-09-24 Thread Peter Zijlstra
On Sun, 2012-09-23 at 22:19 +0200, Oleg Nesterov wrote: @@ -226,6 +245,10 @@ retry: Could you use: $ cat ~/.gitconfig [diff "default"] xfuncname = ^[[:alpha:]$_].*[^:]$ This stops git-diff from using labels as function names. if (ret = 0) return ret; + ret

Re: [PATCH] Update sched_domains_numa_masks when new cpus are onlined.

2012-09-24 Thread Peter Zijlstra
Why are you cc'ing x86 and numa folks but not a single scheduler person when you're patching scheduler stuff? On Tue, 2012-09-18 at 18:12 +0800, Tang Chen wrote: Once array sched_domains_numa_masks is defined, it is never updated. When a new cpu on a new node is onlined, Hmm, so there's

Re: [PATCH] Update sched_domains_numa_masks when new cpus are onlined.

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 15:27 +0530, Srivatsa S. Bhat wrote: On 09/24/2012 03:08 PM, Peter Zijlstra wrote: + hotcpu_notifier(sched_domains_numa_masks_update, CPU_PRI_SCHED_ACTIVE); hotcpu_notifier(cpuset_cpu_active, CPU_PRI_CPUSET_ACTIVE); hotcpu_notifier

Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-09-24 Thread Peter Zijlstra
On Fri, 2012-09-21 at 17:30 +0530, Raghavendra K T wrote: +unsigned long rq_nr_running(void) +{ + return this_rq()->nr_running; +} +EXPORT_SYMBOL(rq_nr_running); Uhm,.. no, that's a horrible thing to export.

Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

2012-09-24 Thread Peter Zijlstra
On Fri, 2012-09-21 at 17:29 +0530, Raghavendra K T wrote: In some special scenarios like #vcpu = #pcpu, PLE handler may prove very costly, because there is no need to iterate over vcpus and do unsuccessful yield_to burning CPU. What's the costly thing? The vm-exit, the yield (which should be

Re: [PATCH 2/2] [RESEND] console: implement lockdep support for console_lock

2012-09-24 Thread Peter Zijlstra
On Tue, 2012-09-18 at 01:03 +0200, Daniel Vetter wrote: - In the printk code there's a special trylock, only used to kick off the logbuffer printk'ing in console_unlock. But all that happens while lockdep is disabled (since printk does a few other evil tricks). So no issue there, either.

Re: [PATCH 2/2] [RESEND] console: implement lockdep support for console_lock

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 14:17 +0200, Peter Zijlstra wrote: On Tue, 2012-09-18 at 01:03 +0200, Daniel Vetter wrote: - In the printk code there's a special trylock, only used to kick off the logbuffer printk'ing in console_unlock. But all that happens while lockdep is disabled (since printk

Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 17:22 +0530, Raghavendra K T wrote: On 09/24/2012 05:04 PM, Peter Zijlstra wrote: On Fri, 2012-09-21 at 17:29 +0530, Raghavendra K T wrote: In some special scenarios like #vcpu= #pcpu, PLE handler may prove very costly, because there is no need to iterate over vcpus

Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-17 at 13:38 -0300, Rafael Aquini wrote: +static inline void assign_balloon_mapping(struct page *page, + struct address_space *mapping) +{ + page->mapping = mapping; + smp_wmb(); +} + +static inline void

Re: [PATCH 2/2] [RESEND] console: implement lockdep support for console_lock

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 14:54 +0200, Daniel Vetter wrote: I've read through the patches and I'm hoping you don't volunteer me to pick these up ... ;-) Worth a try, right? :-) But there doesn't seem to be anything that would get worse through this lockdep annotation patch, right? No indeed,

Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 18:59 +0530, Raghavendra K T wrote: However Rik had a genuine concern in the cases where runqueue is not equally distributed and lockholder might actually be on a different run queue but not running. Load should eventually get distributed equally -- that's what the

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 16:00 +0100, Mel Gorman wrote: On Fri, Sep 14, 2012 at 02:42:44PM -0700, Linus Torvalds wrote: On Fri, Sep 14, 2012 at 2:27 PM, Borislav Petkov b...@alien8.de wrote: as Nikolay says below, we have a regression in 3.6 with pgbench's benchmark in postgresql. I

Re: [PATCH RFC 2/2] kvm: Be courteous to other VMs in overcommitted scenario in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 17:26 +0200, Avi Kivity wrote: I think this is a no-op these (CFS) days. To get schedule() to do anything, you need to wake up a task, or let time pass, or block. Otherwise it will see that nothing has changed and as far as it's concerned you're still the best task to be

Re: [PATCH RFC 2/2] kvm: Be courteous to other VMs in overcommitted scenario in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 17:43 +0200, Avi Kivity wrote: Wouldn't this correspond to the scheduler interrupt firing and causing a reschedule? I thought the timer was programmed for exactly the point in time that CFS considers the right time for a switch. But I'm basing this on my mental model of

Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 17:51 +0200, Avi Kivity wrote: On 09/24/2012 03:54 PM, Peter Zijlstra wrote: On Mon, 2012-09-24 at 18:59 +0530, Raghavendra K T wrote: However Rik had a genuine concern in the cases where runqueue is not equally distributed and lockholder might actually

Re: [PATCH RFC 2/2] kvm: Be courteous to other VMs in overcommitted scenario in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 17:58 +0200, Avi Kivity wrote: There is the TSC deadline timer mode of newer Intels. Programming the timer is a simple wrmsr, and it will fire immediately if it already expired. Unfortunately on AMDs it is not available, and on virtual hardware it will be slow (~1-2

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 08:52 -0700, Linus Torvalds wrote: Your patch looks odd, though. Why do you use some complex initial value for 'candidate' (nr_cpu_ids) instead of a simple and readable one (-1)? nr_cpu_ids is the typical no-value value for cpumask operations -- yes this is annoying and

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 08:52 -0700, Linus Torvalds wrote: And the whole "if we find any non-idle cpu, skip the whole domain" logic really seems a bit odd (that's not new to your patch, though). Can somebody explain what the whole point of that idiotically written function is? So we're looking

Re: [PATCH RFC 2/2] kvm: Be courteous to other VMs in overcommitted scenario in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 18:10 +0200, Avi Kivity wrote: Its also still a LAPIC write -- disguised as an MSR though :/ It's probably a whole lot faster though. I've been told its not, I haven't tried it.

Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 18:06 +0200, Avi Kivity wrote: We would probably need a ->sched_exit() preempt notifier to make this work. Peter, I know how much you love those, would it be acceptable? Where exactly do you want this? TASK_DEAD? or another exit?

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 09:30 -0700, Linus Torvalds wrote: On Mon, Sep 24, 2012 at 9:12 AM, Peter Zijlstra a.p.zijls...@chello.nl wrote: So we're looking for an idle cpu around @target. We prefer a cpu of an idle core, since SMT-siblings share L[12] cache. The way we do

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 09:33 -0700, Linus Torvalds wrote: Sure, the scan bits bitops will return >= nr_cpu_ids for the "I couldn't find a bit" thing, but that doesn't mean that everything else should. Fair enough.. --- kernel/sched/fair.c | 42 +- 1 file

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-24 Thread Peter Zijlstra
On Mon, 2012-09-24 at 18:54 +0200, Peter Zijlstra wrote: But let me try and come up with the list thing, I think we've actually got that someplace as well. OK, I'm sure the below can be written better, but my brain is gone for the day... --- include/linux/sched.h | 1 + kernel/sched/core.c

Re: [PATCH] Update sched_domains_numa_masks when new cpus are onlined.

2012-09-25 Thread Peter Zijlstra
On Tue, 2012-09-25 at 10:39 +0800, Tang Chen wrote: @@ -6765,11 +6773,64 @@ static void sched_init_numa(void) } sched_domain_topology = tl; + +sched_domains_numa_levels = level; And I set it to level here again. But it's already set there.. it's set every time we find
