Re: [PATCH 6/6] x86: Exit RCU extended QS on notify resume

2012-07-10 Thread Frederic Weisbecker
On Sun, Jul 08, 2012 at 02:17:07PM -0700, Paul E. McKenney wrote: On Fri, Jul 06, 2012 at 01:43:29PM -0700, Josh Triplett wrote: On Fri, Jul 06, 2012 at 09:33:38AM -0700, Paul E. McKenney wrote: On Fri, Jul 06, 2012 at 02:00:18PM +0200, Frederic Weisbecker wrote: --- a/arch/x86/Kconfig

Re: [PATCH] trace: add ability to set a target task for events (v2)

2012-07-11 Thread Frederic Weisbecker
On Wed, Jul 11, 2012 at 06:14:58PM +0400, Andrew Vagin wrote: A few events are interesting not only for a current task. For example, sched_stat_* events are interesting for the task that wakes up. For this reason, it would be good if such events were delivered to a target task too. Now a target

Re: [PATCH] trace: add ability to set a target task for events (v2)

2012-07-11 Thread Frederic Weisbecker
On Wed, Jul 11, 2012 at 04:33:41PM +0200, Peter Zijlstra wrote: On Wed, 2012-07-11 at 16:31 +0200, Frederic Weisbecker wrote: On Wed, Jul 11, 2012 at 06:14:58PM +0400, Andrew Vagin wrote: A few events are interesting not only for a current task. For example, sched_stat_* are interesting

Re: [PATCH] trace: add ability to set a target task for events (v2)

2012-07-11 Thread Frederic Weisbecker
On Wed, Jul 11, 2012 at 04:38:19PM +0200, Peter Zijlstra wrote: On Wed, 2012-07-11 at 16:36 +0200, Frederic Weisbecker wrote: In this case he can just record sched wakeup as well. With sched_switch + sched_wakeup, he's unlikely to lose events. With sched_stat_sleep he will lose events

Re: [PATCH] trace: add ability to set a target task for events (v2)

2012-07-11 Thread Frederic Weisbecker
On Wed, Jul 11, 2012 at 04:55:08PM +0200, Peter Zijlstra wrote: On Wed, 2012-07-11 at 16:48 +0200, Frederic Weisbecker wrote: On Wed, Jul 11, 2012 at 04:38:19PM +0200, Peter Zijlstra wrote: On Wed, 2012-07-11 at 16:36 +0200, Frederic Weisbecker wrote: In this case he can just record

Re: [PATCH] trace: add ability to set a target task for events (v2)

2012-07-11 Thread Frederic Weisbecker
On Wed, Jul 11, 2012 at 05:12:04PM +0200, Peter Zijlstra wrote: On Wed, 2012-07-11 at 16:55 +0200, Peter Zijlstra wrote: Right.. back when I did that the plan was to make PERF_SAMPLE_PERIOD fix that, of course that never seemed to have happened. With PERF_SAMPLE_PERIOD you can simply

[RFC PATCH 00/11] rcu: Userspace RCU extended quiescent state v2

2012-07-11 Thread Frederic Weisbecker
and the overhead is lowered in the off-case. This can be even better if we use jump labels later. Thanks. git://github.com/fweisbec/linux-dynticks.git rcu/user-2 Frederic Weisbecker (11): rcu: Settle config for userspace extended quiescent state rcu: Allow rcu_user_enter()/exit

[PATCH 04/11] rcu: Switch task's syscall hooks on context switch

2012-07-11 Thread Frederic Weisbecker
Clear the syscalls hook of a task when it's scheduled out so that if the task migrates, it doesn't run the syscall slow path on a CPU that might not need it. Also set the syscalls hook on the next task if needed. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog

[PATCH 05/11] x86: Syscall hooks for userspace RCU extended QS

2012-07-11 Thread Frederic Weisbecker
Add syscall slow path hooks to notify syscall entry and exit on CPUs that want to support userspace RCU extended quiescent state. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity

[PATCH 06/11] x86: Exception hooks for userspace RCU extended QS

2012-07-11 Thread Frederic Weisbecker
Add necessary hooks to x86 exception for userspace RCU extended quiescent state support. This includes traps, page fault, debug exceptions, etc... Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi

[PATCH 09/11] x86: Use the new schedule_user API on userspace preemption

2012-07-11 Thread Frederic Weisbecker
This way we can exit the RCU extended quiescent state before we schedule a new task from irq/exception exit. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris

[PATCH 10/11] x86: Exit RCU extended QS on notify resume

2012-07-11 Thread Frederic Weisbecker
back rcu_user_enter() after this function because we know we are going to userspace from there. This completes support for userspace RCU extended quiescent state in x86-64. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux

[PATCH 11/11] rcu: Userspace RCU extended QS selftest

2012-07-11 Thread Frederic Weisbecker
Provide a config option that enables the userspace RCU extended quiescent state on every CPU by default. This is for testing purposes. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity

[PATCH 07/11] rcu: Exit RCU extended QS on kernel preemption after irq/exception

2012-07-11 Thread Frederic Weisbecker
before we call it. To solve this, just call rcu_user_exit() in the beginning of preempt_schedule_irq(). Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc

[PATCH 01/11] rcu: Settle config for userspace extended quiescent state

2012-07-11 Thread Frederic Weisbecker
Create a new config option under the RCU menu that puts CPUs under RCU extended quiescent state (as in dynticks idle mode) when they run in userspace. This requires some contribution from architectures to hook into kernel and userspace boundaries. Signed-off-by: Frederic Weisbecker fweis

[PATCH 02/11] rcu: Allow rcu_user_enter()/exit() to nest

2012-07-11 Thread Frederic Weisbecker
userspace before calling rcu_user_exit(). Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff

[PATCH 03/11] rcu: Ignore userspace extended quiescent state by default

2012-07-11 Thread Frederic Weisbecker
By default we don't want to enter RCU extended quiescent state while in userspace because doing so adds some overhead (e.g. use of the syscall slowpath). Keep it off by default, ready to be enabled when a feature such as adaptive tickless needs it. Signed-off-by: Frederic Weisbecker fweis

[PATCH 08/11] rcu: Exit RCU extended QS on user preemption

2012-07-11 Thread Frederic Weisbecker
() and the irq has called rcu_irq_exit() already. Create a new API schedule_user() that calls schedule() inside rcu_user_exit()-rcu_user_enter() in order to protect it. Archs will need to rely on it now to implement user preemption safely. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor

Re: Fork bomb limitation in memcg WAS: Re: [PATCH 00/11] kmem controller for memcg: stripped down version

2012-07-12 Thread Frederic Weisbecker
On Tue, Jul 03, 2012 at 03:38:39PM +0400, Glauber Costa wrote: On 06/29/2012 02:25 AM, Andrew Morton wrote: On Thu, 28 Jun 2012 13:01:23 +0400 Glauber Costa glom...@parallels.com wrote: ... OK, that all sounds convincing ;) Please summarise and capture this discussion in the

Re: [PATCH 03/17] perf, x86: Add copy_from_user_nmi_nochk for best effort copy

2012-07-25 Thread Frederic Weisbecker
On Sun, Jul 22, 2012 at 02:14:26PM +0200, Jiri Olsa wrote: Adding copy_from_user_nmi_nochk that provides a best-effort copy regardless of whether the requested size crosses the task boundary. This is going to be useful for the stack dump we need in post DWARF CFI based unwind, where we have predefined

Re: [PATCH 01/17] perf: Unified API to record selective sets of arch registers

2012-07-25 Thread Frederic Weisbecker
if needed in the future. Signed-off-by: Jiri Olsa jo...@redhat.com Original-patch-by: Frederic Weisbecker fweis...@gmail.com Acked-by: Frederic Weisbecker fweis...@gmail.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org

Re: [PATCH 02/17] perf: Add ability to attach user level registers dump to sample

2012-07-25 Thread Frederic Weisbecker
. This is going to be useful to bring Dwarf CFI based stack unwinding on top of samples. Signed-off-by: Jiri Olsa jo...@redhat.com Original-patch-by: Frederic Weisbecker fweis...@gmail.com Acked-by: Frederic Weisbecker fweis...@gmail.com

Re: [PATCH 01/10] sched: select_task_rq_fair clean up

2012-12-06 Thread Frederic Weisbecker
2012/12/3 Alex Shi alex@intel.com: It is impossible to miss a task allowed cpu in an eligible group. And since find_idlest_group only returns a different group which excludes old cpu, it's also impossible to find a new cpu same as old cpu. Is it possible for weighted_cpuload() to return

Re: [PATCH 02/10] sched: fix find_idlest_group mess logical

2012-12-06 Thread Frederic Weisbecker
2012/12/3 Alex Shi alex@intel.com: There are 4 situations in the function: 1, no task allowed group; so min_load = ULONG_MAX, this_load = 0, idlest = NULL 2, only local group task allowed; so min_load = ULONG_MAX, this_load assigned, idlest = NULL 3, only non-local task

Re: [PATCH 01/10] sched: select_task_rq_fair clean up

2012-12-06 Thread Frederic Weisbecker
2012/12/7 Alex Shi alex@intel.com: On 12/07/2012 01:50 AM, Frederic Weisbecker wrote: 2012/12/3 Alex Shi alex@intel.com: It is impossible to miss a task allowed cpu in an eligible group. And since find_idlest_group only returns a different group which excludes old cpu, it's also

Re: [PATCH 02/10] sched: fix find_idlest_group mess logical

2012-12-07 Thread Frederic Weisbecker
2012/12/7 Alex Shi alex@intel.com: On 12/07/2012 08:56 AM, Frederic Weisbecker wrote: 2012/12/3 Alex Shi alex@intel.com: There are 4 situations in the function: 1, no task allowed group; so min_load = ULONG_MAX, this_load = 0, idlest = NULL 2, only local group task allowed

Re: [PATCH v2 00/14] printk() fixes, optimizations, and clean ups

2012-12-07 Thread Frederic Weisbecker
to printk that should likely be applied in some form for -rc1 or earlier: Sylvain Munaut's (cc'd) print_time fix: https://patchwork.kernel.org/patch/1845971/ I don't know of anything else that likely will or should be applied before -rc1. Frederic Weisbecker (also cc'd) has a printk nohz

Re: [GIT PULL v2] printk: Make it usable on nohz cpus

2012-12-08 Thread Frederic Weisbecker
2012/12/8 Ingo Molnar mi...@kernel.org: * Frederic Weisbecker fweis...@gmail.com wrote: Ingo, Please pull the printk support in dynticks mode patches that can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git tags/printk-dynticks-for-mingo-v2

Re: [context tracking subsystem] Re: [GIT PULL rcu/next] One more RCU commit for 3.8

2012-12-10 Thread Frederic Weisbecker
Thanx, Paul Frederic Weisbecker (1): context_tracking: New context tracking susbsystem arch/Kconfig | 15 ++-- arch/x86/Kconfig | 2 +- arch/x86/include/asm/{rcu.h => context_tracking.h

Re: [PATCH] nohz/cpuset: Make a CPU stick with do_timer() duty in the presence of nohz cpusets

2012-11-19 Thread Frederic Weisbecker
2012/11/20 Steven Rostedt rost...@goodmis.org: On Mon, 2012-11-19 at 17:27 -0700, Hakan Akkan wrote: I suggest to rather define a tunable timekeeping duty CPU affinity in a cpumask file at /sys/devices/system/cpu/timekeeping and a toggle at /sys/devices/system/cpu/cpuX/timekeeping (like

[GIT PULL] cputime: cleanups

2012-11-20 Thread Frederic Weisbecker
time accounting APIs) Thanks. --- Some more cputime cleanups: * Get rid of underscores polluting the vtime namespace * Consolidate context switch and tick handling * Improve debuggability by detecting irq unsafe callers Signed-off-by: Frederic Weisbecker fweis...@gmail.com --- Frederic

[PATCH 1/5] vtime: Remove the underscore prefix invasion

2012-11-20 Thread Frederic Weisbecker
() for this specific case so that we can remove the underscore prefix on other vtime functions. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Reviewed-by: Steven Rostedt rost...@goodmis.org Cc: Peter Zijlstra pet...@infradead.org Cc: Ingo Molnar mi...@kernel.org Cc: Thomas Gleixner t...@linutronix.de Cc

[PATCH 2/5] vtime: Explicitly account pending user time on process tick

2012-11-20 Thread Frederic Weisbecker
All vtime implementations just flush the user time on process tick. Consolidate that in generic code by calling a user time accounting helper. This avoids an indirect call in ia64 and prepares to also consolidate vtime context switch code. Signed-off-by: Frederic Weisbecker fweis...@gmail.com

[PATCH 4/5] vtime: No need to disable irqs on vtime_account()

2012-11-20 Thread Frederic Weisbecker
vtime_account() is only called from irq entry. irqs are always disabled at this point so we can safely remove the irq disabling guards on that function. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Reviewed-by: Steven Rostedt rost...@goodmis.org Cc: Peter Zijlstra pet...@infradead.org Cc

[PATCH 5/5] vtime: Warn if irqs aren't disabled on system time accounting APIs

2012-11-20 Thread Frederic Weisbecker
or we missed something. Suggested-by: Steven Rostedt rost...@goodmis.org Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Peter Zijlstra pet...@infradead.org Cc: Ingo Molnar mi...@kernel.org Cc: Thomas Gleixner t...@linutronix.de Cc: Steven Rostedt rost...@goodmis.org Cc: Paul Gortmaker

[PATCH 3/5] vtime: Consolidate a bit the ctx switch code

2012-11-20 Thread Frederic Weisbecker
own implementation. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Reviewed-by: Steven Rostedt rost...@goodmis.org Cc: Peter Zijlstra pet...@infradead.org Cc: Ingo Molnar mi...@kernel.org Cc: Thomas Gleixner t...@linutronix.de Cc: Steven Rostedt rost...@goodmis.org Cc: Paul Gortmaker

[PATCH 0/3] cputime: Cleanups on adjusted cputime code

2012-11-23 Thread Frederic Weisbecker
Hi, Not vtime related this time, just a few supplementary cleanups on the part that computes the adjusted cputime values. Thanks. Frederic Weisbecker (3): cputime: Move thread_group_cputime() to sched code cputime: Rename thread_group_times to thread_group_cputime_adjusted cputime

[PATCH 2/3] cputime: Rename thread_group_times to thread_group_cputime_adjusted

2012-11-23 Thread Frederic Weisbecker
of thread_group_cputime() that does some stabilization on the raw cputime values. ie here: scale on top of CFS runtime stats and bound lower value for monotonicity. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Thomas Gleixner t

[PATCH 3/3] cputime: Consolidate cputime adjustment code

2012-11-23 Thread Frederic Weisbecker
in the group and the previous adjusted snapshot of the whole group from the signal structure. Just consolidate the common code that does the adjustment. These functions just need to fetch the values from the appropriate source. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi

[PATCH 1/3] cputime: Move thread_group_cputime() to sched code

2012-11-23 Thread Frederic Weisbecker
thread_group_cputime() is a general cputime API that is not only used by posix cpu timer. Let's move this helper to sched code. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Thomas Gleixner t...@linutronix.de Cc

Re: [PATCH 3/3] cputime: Consolidate cputime adjustment code

2012-11-25 Thread Frederic Weisbecker
2012/11/25 Paul Gortmaker paul.gortma...@windriver.com: --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -433,6 +433,11 @@ struct cpu_itimer { u32 incr_error; }; +struct cputime { + cputime_t utime; + cputime_t stime; +}; + Hi Frederic, This new struct

Re: [PATCH 2/3] cputime: Rename thread_group_times to thread_group_cputime_adjusted

2012-11-26 Thread Frederic Weisbecker
2012/11/26 Steven Rostedt rost...@goodmis.org: On Fri, 2012-11-23 at 15:21 +0100, Frederic Weisbecker wrote: We have thread_group_cputime() and thread_group_times(). The naming doesn't provide enough information about the difference between these two APIs. To lower the confusion, rename

Re: [PATCH 1/3] context_tracking: New context tracking susbsystem

2012-11-26 Thread Frederic Weisbecker
2012/11/6 Gilad Ben-Yossef gi...@benyossef.com: On Sat, Nov 3, 2012 at 6:09 PM, Frederic Weisbecker fweis...@gmail.com wrote: diff --git a/arch/Kconfig b/arch/Kconfig index 366ec06..3855e06 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -300,15 +300,15 @@ config SECCOMP_FILTER

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Li Zhong zh...@linux.vnet.ibm.com: I noticed some warnings complaining about dynticks_nesting value, like [ 267.545032] ------------[ cut here ]------------ [ 267.545032] WARNING: at kernel/rcutree.c:382 rcu_eqs_enter+0xab/0xc0() [ 267.545032] Hardware name: Bochs [

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Gleb Natapov g...@redhat.com: On Tue, Nov 27, 2012 at 01:15:25PM +0800, Li Zhong wrote: @@ -247,10 +247,17 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code) break; case KVM_PV_REASON_PAGE_NOT_PRESENT: /* page is swapped out

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Li Zhong zh...@linux.vnet.ibm.com: I noticed some warnings complaining about dynticks_nesting value, like [ 267.545032] ------------[ cut here ]------------ [ 267.545032] WARNING: at kernel/rcutree.c:382 rcu_eqs_enter+0xab/0xc0() [ 267.545032] Hardware name: Bochs [

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Gleb Natapov g...@redhat.com: For KVM_PV_REASON_PAGE_NOT_PRESENT it behaves like an exception. Ok. There seems to be a bug in kvm_async_pf_task_wait(). Using idle_cpu(cpu) to find out if the current task is the idle task may not work if there is a pending wake up. We may schedule another

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Li Zhong zh...@linux.vnet.ibm.com: @@ -247,10 +247,17 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code) break; case KVM_PV_REASON_PAGE_NOT_PRESENT: /* page is swapped out by the host. */ - rcu_irq_enter();

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Paul E. McKenney paul...@linux.vnet.ibm.com: It is OK to call rcu_irq_exit() without a matching rcu_irq_enter() -only- if you have also called rcu_idle_exit() since the last rcu_idle_enter(). There will be a similar rule for rcu_user_exit(). More generally, it is OK to call

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Li Zhong zh...@linux.vnet.ibm.com: I noticed some warnings complaining about dynticks_nesting value, like [ 267.545032] ------------[ cut here ]------------ [ 267.545032] WARNING: at kernel/rcutree.c:382 rcu_eqs_enter+0xab/0xc0() [ 267.545032] Hardware name: Bochs [

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Paul E. McKenney paul...@linux.vnet.ibm.com: On Tue, Nov 27, 2012 at 04:56:30PM +0100, Frederic Weisbecker wrote: 2012/11/27 Gleb Natapov g...@redhat.com: For KVM_PV_REASON_PAGE_NOT_PRESENT it behaves like an exception. Ok. There seems to be a bug in kvm_async_pf_task_wait

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Gleb Natapov g...@redhat.com: On Tue, Nov 27, 2012 at 04:56:30PM +0100, Frederic Weisbecker wrote: 2012/11/27 Gleb Natapov g...@redhat.com: For KVM_PV_REASON_PAGE_NOT_PRESENT it behaves like an exception. Ok. There seems to be a bug in kvm_async_pf_task_wait(). Using idle_cpu(cpu

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Gleb Natapov g...@redhat.com: diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 4180a87..636800d 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -113,7 +113,7 @@ void kvm_async_pf_task_wait(u32 token) int cpu, idle; cpu =

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Gleb Natapov g...@redhat.com: On Tue, Nov 27, 2012 at 06:30:32PM +0100, Frederic Weisbecker wrote: 2012/11/27 Gleb Natapov g...@redhat.com: diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 4180a87..636800d 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Gleb Natapov g...@redhat.com: What are the semantics of enter_idle()/exit_idle(), what are they used for? They are used by drivers/idle/i7300_idle.c for some tracking. I don't know the details well. enter_idle() is called right before the CPU is set to lower power mode: hlt() exit_idle()

Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

2012-11-27 Thread Frederic Weisbecker
2012/11/27 Frederic Weisbecker fweis...@gmail.com: 2012/11/27 Gleb Natapov g...@redhat.com: What are the semantics of enter_idle()/exit_idle(), what are they used for? They are used by drivers/idle/i7300_idle.c for some tracking. I don't know the details well. enter_idle() is called right before

Re: [PATCH 2/3] cputime: Rename thread_group_times to thread_group_cputime_adjusted

2012-11-27 Thread Frederic Weisbecker
2012/11/26 Steven Rostedt rost...@goodmis.org: OK, let's take a look at the other version now: void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *st) So this does the same thing as thread_group_cputime(), i.e. fetch the raw cputime stats from the task/signal struct, with

Re: [RFC PATCH v2] Add rcu user eqs exception hooks for async page fault

2012-11-28 Thread Frederic Weisbecker
2012/11/28 Li Zhong zh...@linux.vnet.ibm.com: Thank you all for the review and education. Below is my current understanding and an updated version. Would you please help to review it again and give your comments? Thanks, Zhong Now it seems to me that it is legal to call

Re: [RFC PATCH v2] Add rcu user eqs exception hooks for async page fault

2012-11-28 Thread Frederic Weisbecker
2012/11/28 Gleb Natapov g...@redhat.com: On Wed, Nov 28, 2012 at 01:55:42PM +0100, Frederic Weisbecker wrote: Yes but if rcu_irq_*() calls are fine to be called there, and I believe they are because exception_enter() exits the user mode, we should start to protect there right now instead

[PATCH 0/4] cputime: Cleanups on adjusted cputime code v2

2012-11-28 Thread Frederic Weisbecker
Hi, Changes since v1 address Steven Rostedt and Paul Gortmaker reviews: - More comments to distinguish struct cputime / struct task_cputime [3/4] - Comment the reasons and the details for cputime adjustment [4/4] Thanks. Frederic Weisbecker (4): cputime: Move thread_group_cputime() to sched

[PATCH 1/4] cputime: Move thread_group_cputime() to sched code

2012-11-28 Thread Frederic Weisbecker
thread_group_cputime() is a general cputime API that is not only used by posix cpu timer. Let's move this helper to sched code. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Thomas Gleixner t...@linutronix.de Cc

[PATCH 3/4] cputime: Consolidate cputime adjustment code

2012-11-28 Thread Frederic Weisbecker
in the group and the previous adjusted snapshot of the whole group from the signal structure. Just consolidate the common code that does the adjustment. These functions just need to fetch the values from the appropriate source. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi

[PATCH 2/4] cputime: Rename thread_group_times to thread_group_cputime_adjusted

2012-11-28 Thread Frederic Weisbecker
of thread_group_cputime() that does some stabilization on the raw cputime values. ie here: scale on top of CFS runtime stats and bound lower value for monotonicity. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Thomas Gleixner t

[PATCH 4/4] cputime: Comment cputime's adjusting code

2012-11-28 Thread Frederic Weisbecker
The reason for the scaling and monotonicity correction performed by cputime_adjust() may not be immediately clear to the reviewer. Add some comments to explain what happens there. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet

[PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based cputime accounting

2012-12-20 Thread Frederic Weisbecker
anytime in order to minimize the overhead associated to user hooks. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c

[PATCH 06/24] nohz: Basic full dynticks interface

2012-12-20 Thread Frederic Weisbecker
. McKenney paul...@linux.vnet.ibm.com Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge

[PATCH 04/24] cputime: Use accessors to read task cputime stats

2012-12-20 Thread Frederic Weisbecker
This is in preparation for the full dynticks feature. While remotely reading the cputime of a task running in a full dynticks CPU, we'll need to do some extra-computation. This way we can account the time it spent tickless in userspace since its last cputime snapshot. Signed-off-by: Frederic

[PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU

2012-12-20 Thread Frederic Weisbecker
running. But let's use this KISS solution for now. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc

[PATCH 11/24] sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz

2012-12-20 Thread Frederic Weisbecker
Just to avoid confusion. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge

[PATCH 13/24] sched: Update rq clock on nohz CPU before setting fair group shares

2012-12-20 Thread Frederic Weisbecker
tickless because scheduler_tick() is not there to maintain it. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c

[PATCH 20/24] nohz: Full dynticks mode

2012-12-20 Thread Frederic Weisbecker
things to be done from scheduler_tick()] [ Included build fix from Geoff Levand ] Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc

[PATCH 19/24] nohz: Move nohz load balancer selection into idle logic

2012-12-20 Thread Frederic Weisbecker
[ ** BUGGY PATCH: I need to put more thinking into this ** ] We want the nohz load balancer to be an idle CPU, thus move that selection to strict dyntick idle logic. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux

[PATCH 18/24] sched: Update nohz rq clock before searching busiest group on load balancing

2012-12-20 Thread Frederic Weisbecker
is in dyntick-idle mode? Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge

[PATCH 09/24] nohz: Wake up full dynticks CPUs when a timer gets enqueued

2012-12-20 Thread Frederic Weisbecker
Wake up a CPU when a timer list timer is enqueued there and the CPU is in full dynticks mode. Sending an IPI to it makes it reconsider the next timer to program on top of recent updates. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew

[PATCH 10/24] rcu: Restart the tick on non-responding full dynticks CPUs

2012-12-20 Thread Frederic Weisbecker
When a CPU in full dynticks mode doesn't respond to complete a grace period, issue it a specific IPI so that it restarts the tick and chases a quiescent state. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux

[PATCH 17/24] sched: Update rq clock before idle balancing

2012-12-20 Thread Frederic Weisbecker
idle_balance() is called from schedule() right before we schedule the idle task. It needs to record the idle timestamp at that time and for this the rq clock must be accurate. If the CPU is running tickless we need to update the rq clock manually. Signed-off-by: Frederic Weisbecker fweis

[ANNOUNCE] 3.7-nohz1

2012-12-20 Thread Frederic Weisbecker
isolation: https://github.com/gby/linux/wiki But keep in mind that my tree is not yet ready for serious production. Happy Christmas, new year or whatever end of the world. --- Frederic Weisbecker (32): irq_work: Fix racy IRQ_WORK_BUSY flag setting irq_work: Fix racy check on work

[PATCH 08/24] nohz: Trace timekeeping update

2012-12-20 Thread Frederic Weisbecker
Not for merge. This may become a real tracepoint. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com

[PATCH 15/24] sched: Update rq clock earlier in unthrottle_cfs_rq

2012-12-20 Thread Frederic Weisbecker
In this function we use rq->clock right before the rq clock is updated. Call update_rq_clock() before that use to avoid reading a stale rq clock value. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton

[PATCH 16/24] sched: Update clock of nohz busiest rq before balancing

2012-12-20 Thread Frederic Weisbecker
clock before reading it. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge

[PATCH 22/24] nohz: Don't turn off the tick if rcu needs it

2012-12-20 Thread Frederic Weisbecker
: OTOH we don't want to handle a locally started grace period, this should be offloaded for rcu_nocb CPUs. What we want is to be kicked if we stay dynticks in the kernel for too long (ie: to report a quiescent state). rcu_pending() is perhaps an overkill just for that. Signed-off-by: Frederic

[PATCH 12/24] sched: Update rq clock on nohz CPU before migrating tasks

2012-12-20 Thread Frederic Weisbecker
Because the sched_class::put_prev_task() callbacks of the rt and fair classes refer to the rq clock to update their runtime statistics. A CPU running in tickless mode may carry a stale value. We need to update it there. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor

[PATCH 21/24] nohz: Only stop the tick on RCU nocb CPUs

2012-12-20 Thread Frederic Weisbecker
On a full dynticks CPU, we want the RCU callbacks to be offloaded to another CPU, otherwise we need to keep the tick to wait for the grace period completion. Ensure the full dynticks CPU is also an rcu_nocb one. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog
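A minimal sketch of the condition this patch describes: the tick may only be stopped on a CPU that is both full dynticks and rcu_nocb. The per-CPU arrays and `toy_can_stop_tick()` are hypothetical names chosen for illustration and do not reflect how the kernel stores these masks.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical per-CPU flags; the kernel keeps this in cpumasks. */
static bool toy_full_dynticks[TOY_NR_CPUS];
static bool toy_rcu_nocb[TOY_NR_CPUS];

/*
 * The tick can be stopped only if RCU callbacks for this CPU are
 * handled elsewhere; otherwise the tick is still needed to wait
 * for grace period completion.
 */
static bool toy_can_stop_tick(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    return toy_full_dynticks[cpu] && toy_rcu_nocb[cpu];
}
```

A CPU marked full dynticks but not rcu_nocb keeps its tick in this model, which is the case the patch guards against.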

[PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs

2012-12-20 Thread Frederic Weisbecker
-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge...@infradead.org Cc: Gilad Ben Yossef gi

[PATCH 02/24] cputime: Generic on-demand virtual cputime accounting

2012-12-20 Thread Frederic Weisbecker
native virtual based cputime accounting, which hooks into low-level code and uses a CPU hardware clock. Precision is not the goal of this though. - There is probably more overhead than a native virtual based cputime accounting. But this relies on hooks that are already set anyway. Signed-off-by: Frederic

[PATCH 23/24] nohz: Don't stop the tick if posix cpu timers are running

2012-12-20 Thread Frederic Weisbecker
-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge...@infradead.org Cc: Gilad Ben Yossef gi

[PATCH 14/24] sched: Update rq clock on tickless CPUs before calling check_preempt_curr()

2012-12-20 Thread Frederic Weisbecker
manually in case the CPU runs tickless because ttwu_do_wakeup() calls check_preempt_wakeup(). Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc

[PATCH 24/24] nohz: Add some tracing

2012-12-20 Thread Frederic Weisbecker
Not for merge, just for debugging. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff

[PATCH 01/24] context_tracking: Add comments on interface and internals

2012-12-20 Thread Frederic Weisbecker
This subsystem lacks many explanations on its purpose and design. Add these missing comments. Reported-by: Andrew Morton a...@linux-foundation.org Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi

[PATCH] profiling: Remove unused timer hook

2012-12-22 Thread Frederic Weisbecker
out of tree user. Let's remove it. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Avi Kivity a...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand

Re: [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU

2012-12-22 Thread Frederic Weisbecker
2012/12/21 Steven Rostedt rost...@goodmis.org: On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote: kernel/time/tick-sched.c:517:6: error: have_full_nohz_mask undeclared (first use in this function) --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -112,7 +112,8

Re: [PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based cputime accounting

2012-12-22 Thread Frederic Weisbecker
2012/12/21 Steven Rostedt rost...@goodmis.org: @@ -601,6 +612,7 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime thread_group_cputime(p, cputime); cputime_adjust(cputime, p->signal->prev_cputime, ut, st); } +#endif /*

Re: [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs

2012-12-22 Thread Frederic Weisbecker
2012/12/21 Steven Rostedt rost...@goodmis.org: On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote: --- a/include/linux/init_task.h +++ b/include/linux/init_task.h @@ -10,6 +10,7 @@ #include <linux/pid_namespace.h> #include <linux/user_namespace.h> #include <linux/securebits.h>

Re: [ANNOUNCE] 3.7-nohz1

2012-12-23 Thread Frederic Weisbecker
2012/12/21 Steven Rostedt rost...@goodmis.org: On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote: Let's imagine you have 4 CPUs. We keep CPU 0 to offload RCU callbacks there and to handle the timekeeping. We set the rest as full dynticks. So you need the following kernel

Re: [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting

2012-12-29 Thread Frederic Weisbecker
2012/12/26 Li Zhong zh...@linux.vnet.ibm.com: On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote: diff --git a/init/Kconfig b/init/Kconfig index 60579d6..a64b3e8 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -340,7 +340,9 @@ config TICK_CPU_ACCOUNTING config

Re: [PATCH 20/24] nohz: Full dynticks mode

2012-12-29 Thread Frederic Weisbecker
2012/12/26 Namhyung Kim namhy...@kernel.org: Hi Frederic, On Thu, 20 Dec 2012 19:33:07 +0100, Frederic Weisbecker wrote: When a CPU is in full dynticks mode, try to switch it to nohz mode from the interrupt exit path if it is running a single non-idle task. Then restart the tick
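The interrupt-exit check quoted above can be sketched roughly as follows. This is a simplified model, not the patch's code: `struct toy_cpu` and `toy_irq_exit()` are hypothetical, and the tick restart path is left out because the quoted description is cut off before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state relevant to the decision. */
struct toy_cpu {
    bool full_dynticks; /* CPU was configured as full dynticks */
    int nr_running;     /* runnable tasks on this CPU */
    bool curr_is_idle;  /* current task is the idle task */
    bool tick_stopped;  /* periodic tick currently stopped */
};

/*
 * On interrupt exit, stop the tick only when the CPU is in full
 * dynticks mode and runs exactly one non-idle task.
 */
static void toy_irq_exit(struct toy_cpu *c)
{
    assert(c != NULL);
    if (c->full_dynticks && c->nr_running == 1 && !c->curr_is_idle)
        c->tick_stopped = true;
}
```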

[ANNOUNCE] 3.8-rc1-nohz1

2012-12-29 Thread Frederic Weisbecker
. Happy new year! --- Frederic Weisbecker (35): irq_work: Fix racy IRQ_WORK_BUSY flag setting irq_work: Fix racy check on work pending flag irq_work: Remove CONFIG_HAVE_IRQ_WORK nohz: Add API to check tick state irq_work: Don't stop the tick with pending works

[PATCH 02/27] cputime: Generic on-demand virtual cputime accounting

2012-12-29 Thread Frederic Weisbecker
native virtual based cputime accounting, which hooks into low-level code and uses a CPU hardware clock. Precision is not the goal of this though. - There is probably more overhead than a native virtual based cputime accounting. But this relies on hooks that are already set anyway. Signed-off-by: Frederic

[PATCH 04/27] cputime: Use accessors to read task cputime stats

2012-12-29 Thread Frederic Weisbecker
This is in preparation for the full dynticks feature. While remotely reading the cputime of a task running on a full dynticks CPU, we'll need to do some extra computation. This way we can account for the time it spent tickless in userspace since its last cputime snapshot. Signed-off-by: Frederic
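Why an accessor helps here can be shown with a small model. The sketch assumes a hypothetical `struct toy_task` and `toy_task_utime()`; the real patch's accessor names and fields are not reproduced. The accessor adds the time elapsed since the last snapshot when the task is running tickless in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task accounting state. */
struct toy_task {
    uint64_t utime;     /* user time accounted at the last snapshot */
    uint64_t snapshot;  /* timestamp of the last snapshot */
    bool tickless_user; /* running tickless in userspace right now */
};

/*
 * Read user time through an accessor instead of the raw field, so
 * the time spent tickless since the snapshot can be added in.
 */
static uint64_t toy_task_utime(const struct toy_task *t, uint64_t now)
{
    assert(t != NULL && now >= t->snapshot);
    if (t->tickless_user)
        return t->utime + (now - t->snapshot);
    return t->utime;
}
```

A direct read of `utime` would miss the interval since the snapshot; routing every reader through one function gives a single place to add that correction.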

[PATCH 07/27] nohz: Assign timekeeping duty to a non-full-nohz CPU

2012-12-29 Thread Frederic Weisbecker
running. But let's use this KISS solution for now. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Cc: Alessio Igor Bogani abog...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Chris Metcalf cmetc...@tilera.com Cc: Christoph Lameter c...@linux.com Cc: Geoff Levand ge
