Hi,
Here are the current patches that rework load_balance_monitor.
The main reason for doing this is to eliminate the wakeups the thing generates,
esp. on an idle system. The bonus is that it removes a kernel thread.
Paul, Gregory - the thing that bothers me most atm is the lack of
rd->load_balance.
Currently the lb_monitor will walk all the domains/cpus from a single
cpu's timer interrupt. This will cause massive cache-thrashing and cache-line
bouncing on larger machines.
Split the lb_monitor into root_domain (disjoint sched-domains).
Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
Hi Ingo,
Would you stick these into sched-devel?
The first patch should address the latency isolation issue, while the second
rectifies a massive brainfart :-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
meet. This takes the latency into account in
the scheduling decision.
[*] - EDF is correct up until load 1, after that it is not a closed system so
improvement is possible here. It is usable because the system strives to
generate the load 1 situation.
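As a hedged aside (textbook scheduling theory, not part of the patch), the "load 1" boundary referred to here is the classic single-processor EDF utilisation test:

```latex
% A task set \{(C_i, T_i)\} with worst-case execution times C_i and
% periods T_i is schedulable under EDF on one processor iff
\sum_i \frac{C_i}{T_i} \le 1
% i.e. EDF is optimal exactly up to total load 1; beyond that the
% system is overloaded and EDF gives no guarantees.
```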
Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
The goal of calc_delta_asym() is to be asymmetric around NICE_0_LOAD, in
that it favours >=0 over <0. The current implementation does not achieve that.
-20 |
|
0 +---
.'
19 .'
Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
On Fri, 2008-02-15 at 12:43 +0100, Miklos Szeredi wrote:
> - strange key repeating (short press of a key results in lots of key
>press events) when there's some sort of load (I/O?) I may have
>seen this on non-mm kernels as well, but it's definitely more
>noticable in -mm
Do you ha
On Thu, 2008-02-14 at 12:30 -0600, Corey Minyard wrote:
> +/*
> + * Various statistics for IPMI, these index stats[] in the ipmi_smi
> + * structure.
> + */
> +/* Commands we got from the user that were invalid. */
> +#define IPMI_STAT_sent_invalid_commands 0
> +
> +/* Comman
On Fri, 2008-02-15 at 11:46 -0500, Gregory Haskins wrote:
> Peter Zijlstra wrote:
>
> > @@ -6342,8 +6351,14 @@ static void rq_attach_root(struct rq *rq
> > cpu_clear(rq->cpu, old_rd->span);
> > cpu_clear(rq->cpu, old_rd-&
On Wed, 2012-07-25 at 15:40 -0700, Tejun Heo wrote:
> (cc'ing Oleg and Peter)
Right, if you're playing games with preemption, always add the rt and
sched folks.. added mingo and tglx.
> On Wed, Jul 25, 2012 at 03:35:32PM -0700, Peter Boonstoppel wrote:
> > After a kthread is created it signals th
On Thu, 2012-07-26 at 08:50 +0300, Gleb Natapov wrote:
> On Wed, Jul 25, 2012 at 10:35:46PM +0200, Peter Zijlstra wrote:
> > On Tue, 2012-07-24 at 18:15 +0200, Robert Richter wrote:
> > > David,
> > >
> > > On 24.07.12 08:20:19, David Ahern wrote:
> > >
On Wed, 2012-07-25 at 23:16 -0600, David Ahern wrote:
> Peter's patch (see https://lkml.org/lkml/2012/7/9/298) changes kernel
> side to require the use of exclude_guest if the precise modifier is
> used, returning -EOPNOTSUPP if exclude_guest is not set. This patch goes
> after the user experie
On Wed, 2012-07-25 at 15:09 -0700, Hugh Dickins wrote:
> We find out after it hits us, and someone studies the disassembly -
> if we're lucky enough to crash near the origin of the problem.
This is a rather painful way.. see
https://lkml.org/lkml/2009/1/5/555
we were lucky there in that the l
On Thu, 2012-07-26 at 13:27 +0800, Alex Shi wrote:
> If find_idlest_cpu() return '-1', and sd->child is NULL. The function
> select_task_rq_fair will return -1. That is not the function's purpose.
But find_idlest_cpu() will only return -1 if the group mask is fully
excluded by the cpus_allowed mask
On Thu, 2012-07-26 at 17:54 +0200, Oleg Nesterov wrote:
> Yes, but this "avoid the preemption after wakeup" can actually help
> kthread_bind()->wait_task_inactive() ?
Yeah.
> This reminds me, Peter had a patch which teaches wait_task_inactive()
> to use sched_in/sched_out notifiers to avoid the p
__do_huge_pmd_anonymous_page() contains:
/*
* The spinlocking to take the lru_lock inside
* page_add_new_anon_rmap() acts as a full memory
* barrier to be sure clear_huge_page writes become
* visible after the set
On Thu, 2012-07-26 at 22:31 +0200, Peter Zijlstra wrote:
> __do_huge_pmd_anonymous_page() contains:
>
> /*
> * The spinlocking to take the lru_lock inside
> * page_add_new_anon_rmap() acts as a full memory
> *
On Tue, 2012-07-24 at 14:51 -0700, Hugh Dickins wrote:
> I do love the status quo, but an audit would be welcome. When
> it comes to patches, personally I tend to prefer ACCESS_ONCE() and
> smp_read_barrier_depends() and accompanying comments to be hidden away
> in the underlying macros or inlines
On Tue, 2012-07-24 at 15:06 +0100, Ben Hutchings wrote:
> On Mon, 2012-07-23 at 02:07 +0100, Ben Hutchings wrote:
> > 3.2-stable review patch. If anyone has any objections, please let me know.
> >
> > --
> >
> > From
On Thu, 2012-07-26 at 23:01 +0100, Ben Hutchings wrote:
>
> That's what I thought, so I went ahead with just the one.
> Should I queue up the other two for a future 3.2.y update?
Yeah, why not..
On Fri, 2012-07-27 at 09:47 +0800, Alex Shi wrote:
> From 610515185d8a98c14c7c339c25381bc96cd99d93 Mon Sep 17 00:00:00 2001
> From: Alex Shi
> Date: Thu, 26 Jul 2012 08:55:34 +0800
> Subject: [PATCH 1/3] sched: recover SD_WAKE_AFFINE in select_task_rq_fair and
> code clean up
>
> Since power sa
On Fri, 2012-07-20 at 05:31 -0700, Michel Lespinasse wrote:
> --- a/lib/rbtree.c
> +++ b/lib/rbtree.c
> @@ -88,7 +88,8 @@ __rb_rotate_set_parents(struct rb_node *old, struct rb_node
> *new,
> root->rb_node = new;
> }
>
> -void rb_insert_color(struct rb_node *node, struct rb_root
On Fri, 2012-07-20 at 05:31 -0700, Michel Lespinasse wrote:
>
> rb_insert_color() is now a special case of rb_insert_augmented() with
> a do-nothing callback. I used inlining to optimize out the callback,
> with the intent that this would generate the same code as previously
> for rb_insert_augmen
On Fri, 2012-07-20 at 05:31 -0700, Michel Lespinasse wrote:
> --- a/lib/rbtree_test.c
> +++ b/lib/rbtree_test.c
> @@ -1,5 +1,6 @@
> #include
> #include
> +#include
This confuses me.. either it's internal to the rb-tree implementation and
users don't need to see it, or it's not, in which case hu
On Fri, 2012-07-20 at 05:31 -0700, Michel Lespinasse wrote:
> +static inline void
> +rb_erase_augmented(struct rb_node *node, struct rb_root *root,
> + rb_augment_propagate *augment_propagate,
> + rb_augment_rotate *augment_rotate)
So why put all this in a static
On Fri, 2012-07-20 at 05:31 -0700, Michel Lespinasse wrote:
> +static void augment_rotate(struct rb_node *rb_old, struct rb_node *rb_new)
> +{
> + struct test_node *old = rb_entry(rb_old, struct test_node, rb);
> + struct test_node *new = rb_entry(rb_new, struct test_node, rb);
> +
> +
On Fri, 2012-07-27 at 17:40 +0200, Frederic Weisbecker wrote:
> +++ b/kernel/user_hooks.c
> @@ -0,0 +1,56 @@
> +#include
> +#include
> +#include
> +#include
> +
> +struct user_hooks {
> + bool hooking;
> + bool in_user;
> +};
I really detest using bool in structures.. but that's ju
On Mon, 2012-07-30 at 11:27 -0400, Steven Rostedt wrote:
> I'm curious to what you have against bool in structures?
_Bool as per the C std doesn't have a specified storage. Now IIRC hpa
recently said that all GCC versions so far were consistent and used char
(a byte) for it, but I might mis-rememb
On Mon, 2012-07-30 at 12:07 -0400, Steven Rostedt wrote:
>
> Would 'is_hooked' be better? 'is_hooking' sounds more like what women in
> high heels, really short skirts and lots of makeup are doing late night
> on a corner of a Paris street ;-)
This is exactly the first thing I thought of when I re
On Mon, 2012-07-30 at 12:07 -0400, Steven Rostedt wrote:
>
> Not only does bool describe it better, it should also allow gcc to
> optimize it better as well. Unless Peter has a legitimate rational why
> using bool in struct is bad, I would keep it as is.
I don't mind too much, but like I said, I h
On Tue, 2012-07-31 at 16:57 +0200, Ingo Molnar wrote:
>
> 'callback', while a longer word, is almost always used as a noun
> within the kernel - and it also has a pretty narrow meaning.
An altogether different naming would be something like:
struct user_kernel_tracking {
int want_uk_tr
Hi all,
After having had a talk with Rik about all this NUMA nonsense where he proposed
the scheme implemented in the next to last patch, I came up with a related
means of doing the home-node selection.
I've also switched to (ab)using PROT_NONE for driving the migration faults.
These patches go
Signed-off-by: Peter Zijlstra
---
arch/x86/include/asm/pgtable.h |1
mm/huge_memory.c | 104 +++--
2 files changed, 50 insertions(+), 55 deletions(-)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -350,6 +350,7
NUMA_HIT fully includes NUMA_INTERLEAVE_HIT, so users might
switch to using that.
This cleans up some of the weird MPOL_INTERLEAVE allocation exceptions.
Cc: Lee Schermerhorn
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
drivers/base/node.c|2 -
in
Add migrate_misplaced_page() which deals with migrating pages from
faults. This includes adding a new MIGRATE_FAULT migration mode to
deal with the extra page reference required due to having to look up
the page.
Based-on-work-by: Lee Schermerhorn
Cc: Paul Turner
Signed-off-by: Peter Zijlstra
parts.
Cc: Paul Turner
Cc: Lee Schermerhorn
Cc: Christoph Lameter
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
include/linux/sched.h |1
kernel/sched/core.c | 21 +++-
kernel/sched/debug.c|3
kernel/sched/fair.c | 236
The current heuristic for determining if a task is 'big' is if it's
consuming more than 1/2 a node's worth of cputime. We might want to
add a term here looking at the RSS of the process and compare this
against the available memory per node.
Cc: Rik van Riel
Cc: Paul Turner
Signed-off-by:
rred[NEW]
- default_policy
Note that the tsk_home_node() policy has Migrate-on-Fault enabled to
facilitate efficient on-demand memory migration.
Cc: Paul Turner
Cc: Lee Schermerhorn
Cc: Christoph Lameter
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
mm
It's a bit awkward, but it was the least painful means of modifying the
queue selection. Used in a later patch to conditionally use a random
queue.
Cc: Paul Turner
Cc: Lee Schermerhorn
Cc: Christoph Lameter
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
Suggested-by: Rik van Riel
Cc: Paul Turner
Signed-off-by: Peter Zijlstra
---
include/linux/huge_mm.h |3 +
include/linux/mempolicy.h |4 +-
include/linux/mm.h| 12 ++
mm/huge_memory.c | 21 +++
mm/memory.c | 86
on
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
include/linux/mempolicy.h |1 +
mm/mempolicy.c|8
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 87fabfa..668311a 100644
--- a/include/l
ff-by: Lee Schermerhorn
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
[ Added MPOL_F_LAZY to trigger migrate-on-fault;
simplified code now that we don't have to bother
with special crap for interleaved ]
Signed-off-by: Peter Zijlstra
---
include/linux/mempolicy.h |9
nodes.
After unmap, the pages in regions assigned to the worker threads
will be automatically migrated local to the threads on 1st touch.
Signed-off-by: Lee Schermerhorn
Cc: Lee Schermerhorn
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
[ nearly complete rewrite.. ]
Signed-off-by: Peter
Combine our previous PROT_NONE, mpol_misplaced and
migrate_misplaced_page() pieces into an effective migrate on fault
scheme.
Suggested-by: Rik van Riel
Cc: Paul Turner
Signed-off-by: Peter Zijlstra
---
mm/huge_memory.c | 41 -
mm/memory.c | 42
Avoid a few #ifdef's later on.
Cc: Paul Turner
Cc: Lee Schermerhorn
Cc: Christoph Lameter
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
kernel/sched/sched.h |6 ++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/kernel/
better than no memory.
This patch merely introduces the basic infrastructure, all policy
comes later.
Cc: Lee Schermerhorn
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
include/linux/init_task.h |8
include/linux/sched.h | 10
constraints will try
and move it away.
The balance between these two 'forces' is what will result in the NUMA
placement.
Cc: Rik van Riel
Cc: Paul Turner
Signed-off-by: Peter Zijlstra
---
include/linux/init_task.h |3
include/linux/mm_types.h |3
include/linux/sched.h
currently running on, since the home-node is the long term target
for the task to run on, irrespective of whatever node it might
temporarily run on.
Suggested-by: Rik van Riel
Cc: Paul Turner
Signed-off-by: Peter Zijlstra
---
include/linux/mempolicy.h |6 ++
Make MPOL_LOCAL a real and exposed policy such that applications that
relied on the previous default behaviour can explicitly request it.
Requested-by: Christoph Lameter
Cc: Lee Schermerhorn
Cc: Rik van Riel
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
---
include
-by: Rik van Riel
Signed-off-by: Peter Zijlstra
---
include/linux/mm_types.h |2 ++
kernel/sched/core.c |2 ++
kernel/sched/fair.c | 19 ++-
mm/memory.c | 15 ---
4 files changed, 34 insertions(+), 4 deletions(-)
--- a/include/linux
Remove the need for sched.h from task_work.h so that we can use struct
task_work in struct task_struct in a later patch.
Cc: Oleg Nesterov
Signed-off-by: Peter Zijlstra
---
include/linux/task_work.h |7 ---
kernel/exit.c |5 -
2 files changed, 4 insertions(+), 8
On Sun, 2012-09-23 at 22:19 +0200, Oleg Nesterov wrote:
> A separate patch for better documentation.
>
> set_swbp()->is_swbp_at_addr() is not needed for correctness, it is
> harmless to do the unnecessary __replace_page(old_page, new_page)
> when these 2 pages are identical.
>
> And it can not be
On Sun, 2012-09-23 at 22:19 +0200, Oleg Nesterov wrote:
> @@ -226,6 +245,10 @@ retry:
Could you use:
$ cat ~/.gitconfig
[diff "default"]
xfuncname = "^[[:alpha:]$_].*[^:]$"
This keeps git-diff from using labels as function names.
> if (ret <= 0)
> return ret;
>
>
Why are you cc'ing x86 and numa folks but not a single scheduler person
when you're patching scheduler stuff?
On Tue, 2012-09-18 at 18:12 +0800, Tang Chen wrote:
> Once array sched_domains_numa_masks is defined, it is never updated.
> When a new cpu on a new node is onlined,
Hmm, so there's hardw
On Mon, 2012-09-24 at 15:27 +0530, Srivatsa S. Bhat wrote:
> On 09/24/2012 03:08 PM, Peter Zijlstra wrote:
> >> + hotcpu_notifier(sched_domains_numa_masks_update,
> >> CPU_PRI_SCHED_ACTIVE);
> >> hotcpu_notifier(cpuset_cpu
On Fri, 2012-09-21 at 17:30 +0530, Raghavendra K T wrote:
> +unsigned long rq_nr_running(void)
> +{
> + return this_rq()->nr_running;
> +}
> +EXPORT_SYMBOL(rq_nr_running);
Uhm,.. no, that's a horrible thing to export.
On Fri, 2012-09-21 at 17:29 +0530, Raghavendra K T wrote:
> In some special scenarios like #vcpu <= #pcpu, PLE handler may
> prove very costly, because there is no need to iterate over vcpus
> and do unsuccessful yield_to burning CPU.
What's the costly thing? The vm-exit, the yield (which should
On Tue, 2012-09-18 at 01:03 +0200, Daniel Vetter wrote:
> - In the printk code there's a special trylock, only used to kick off
> the logbuffer printk'ing in console_unlock. But all that happens
> while lockdep is disable (since printk does a few other evil
> tricks). So no issue there, eithe
On Mon, 2012-09-24 at 14:17 +0200, Peter Zijlstra wrote:
> On Tue, 2012-09-18 at 01:03 +0200, Daniel Vetter wrote:
> > - In the printk code there's a special trylock, only used to kick off
> > the logbuffer printk'ing in console_unlock. But all that happens
> >
On Mon, 2012-09-24 at 17:22 +0530, Raghavendra K T wrote:
> On 09/24/2012 05:04 PM, Peter Zijlstra wrote:
> > On Fri, 2012-09-21 at 17:29 +0530, Raghavendra K T wrote:
> >> In some special scenarios like #vcpu<= #pcpu, PLE handler may
> >> prove very costly, becau
On Mon, 2012-09-17 at 13:38 -0300, Rafael Aquini wrote:
> +static inline void assign_balloon_mapping(struct page *page,
> + struct address_space
> *mapping)
> +{
> + page->mapping = mapping;
> + smp_wmb();
> +}
> +
> +static inline void clear_ball
On Mon, 2012-09-24 at 14:54 +0200, Daniel Vetter wrote:
> I've read through the patches and I'm hoping you don't volunteer me to
> pick these up ... ;-)
Worth a try, right? :-)
> But there doesn't seem to be anything that would
> get worse through this lockdep annotation patch, right?
No indee
On Mon, 2012-09-24 at 18:59 +0530, Raghavendra K T wrote:
> However Rik had a genuine concern in the cases where runqueue is not
> equally distributed and lockholder might actually be on a different run
> queue but not running.
Load should eventually get distributed equally -- that's what the
loa
On Mon, 2012-09-24 at 16:00 +0100, Mel Gorman wrote:
> On Fri, Sep 14, 2012 at 02:42:44PM -0700, Linus Torvalds wrote:
> > On Fri, Sep 14, 2012 at 2:27 PM, Borislav Petkov wrote:
> > >
> > > as Nikolay says below, we have a regression in 3.6 with pgbench's
> > > benchmark in postgresql.
> > >
> >
On Mon, 2012-09-24 at 17:26 +0200, Avi Kivity wrote:
> I think this is a no-op these (CFS) days. To get schedule() to do
> anything, you need to wake up a task, or let time pass, or block.
> Otherwise it will see that nothing has changed and as far as it's
> concerned you're still the best task to
On Mon, 2012-09-24 at 17:43 +0200, Avi Kivity wrote:
> Wouldn't this correspond to the scheduler interrupt firing and causing a
> reschedule? I thought the timer was programmed for exactly the point in
> time that CFS considers the right time for a switch. But I'm basing
> this on my mental model
On Mon, 2012-09-24 at 17:51 +0200, Avi Kivity wrote:
> On 09/24/2012 03:54 PM, Peter Zijlstra wrote:
> > On Mon, 2012-09-24 at 18:59 +0530, Raghavendra K T wrote:
> >> However Rik had a genuine concern in the cases where runqueue is not
> >> equally distributed and lock
On Mon, 2012-09-24 at 17:58 +0200, Avi Kivity wrote:
> There is the TSC deadline timer mode of newer Intels. Programming the
> timer is a simple wrmsr, and it will fire immediately if it already
> expired. Unfortunately on AMDs it is not available, and on virtual
> hardware it will be slow (~1-2
On Mon, 2012-09-24 at 08:52 -0700, Linus Torvalds wrote:
> Your patch looks odd, though. Why do you use some complex initial
> value for 'candidate' (nr_cpu_ids) instead of a simple and readable
> one (-1)?
nr_cpu_ids is the typical no-value value for cpumask operations -- yes
this is annoying an
On Mon, 2012-09-24 at 08:52 -0700, Linus Torvalds wrote:
> And the whole "if we find any non-idle cpu, skip the whole domain"
> logic really seems a bit odd (that's not new to your patch, though).
> Can somebody explain what the whole point of that idiotically written
> function is?
So we're look
On Mon, 2012-09-24 at 18:10 +0200, Avi Kivity wrote:
> > Its also still a LAPIC write -- disguised as an MSR though :/
>
> It's probably a whole lot faster though.
I've been told it's not; I haven't tried it.
On Mon, 2012-09-24 at 18:06 +0200, Avi Kivity wrote:
>
> We would probably need a ->sched_exit() preempt notifier to make this
> work. Peter, I know how much you love those, would it be acceptable?
Where exactly do you want this? TASK_DEAD? or another exit?
On Mon, 2012-09-24 at 09:30 -0700, Linus Torvalds wrote:
> On Mon, Sep 24, 2012 at 9:12 AM, Peter Zijlstra
> wrote:
> >
> > So we're looking for an idle cpu around @target. We prefer a cpu of an
> > idle core, since SMT-siblings share L[12] cache. The way we do
On Mon, 2012-09-24 at 09:33 -0700, Linus Torvalds wrote:
> Sure, the "scan bits" bitops will return ">= nr_cpu_ids" for the "I
> couldn't find a bit" thing, but that doesn't mean that everything else
> should.
Fair enough..
---
kernel/sched/fair.c | 42 +-
On Mon, 2012-09-24 at 18:54 +0200, Peter Zijlstra wrote:
> But let me try and come up with the list thing, I think we've
> actually got that someplace as well.
OK, I'm sure the below can be written better, but my brain is gone for
the day...
---
include/linux/sched.h | 1
On Tue, 2012-09-25 at 10:39 +0800, Tang Chen wrote:
> >> @@ -6765,11 +6773,64 @@ static void sched_init_numa(void)
> >> }
> >>
> >> sched_domain_topology = tl;
> >> +
> >> +sched_domains_numa_levels = level;
>
> And I set it to level here again.
>
But it's already set there.. it's set
On Tue, 2012-09-25 at 16:06 +0530, Viresh Kumar wrote:
> +/* sched-domain levels */
> +#define SD_SIBLING 0x01/* Only for CONFIG_SCHED_SMT */
> +#define SD_MC 0x02/* Only for CONFIG_SCHED_MC */
> +#define SD_BOOK0x04/* Only for CONFIG
On Tue, 2012-09-25 at 10:39 +0800, Tang Chen wrote:
> > We do this because nr_node_ids changed, right? This means the entire
> > distance table grew/shrunk, which means we should do the level scan
> > again.
>
> It seems that nr_node_ids will not change once the system is up.
> I'm not quite sure.
On Tue, 2012-09-25 at 12:44 +0200, Stephane Eranian wrote:
> Hi,
>
> I don't understand why the local variable box needs to
> be declared static here:
>
> static struct intel_uncore_box *
> uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
> {
> static struct intel_uncore_box *box;
On Tue, 2012-09-25 at 16:06 +0530, Viresh Kumar wrote:
> @@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq,
> struct work_struct *work)
> {
> int ret;
>
> - ret = queue_work_on(get_cpu(), wq, work);
> - put_cpu();
> + preempt_disable();
> + ret = qu
On Thu, 2012-09-20 at 13:03 -0400, Vince Weaver wrote:
> One additional complication: some of the cache events map to
> event "0". This causes problems because the generic events code
> assumes "0" means not-available. I'm not sure the best way to address
> that problem.
For all except P4 we
On Tue, 2012-09-25 at 17:00 +0530, Viresh Kumar wrote:
> But this is what the initial idea during LPC we had.
Yeah.. that's true.
> Any improvements here you can suggest?
We could uhm... /me tries thinking ... reuse some of the NOHZ magic?
Would that be sufficient, not waking a NOHZ cpu, or do
On Tue, 2012-09-25 at 13:40 +0200, Peter Zijlstra wrote:
> On Tue, 2012-09-25 at 17:00 +0530, Viresh Kumar wrote:
> > But this is what the initial idea during LPC we had.
>
> Yeah.. that's true.
>
> > Any improvements here you can suggest?
>
> We could uhm..
On Tue, 2012-09-25 at 19:45 +0800, Tang Chen wrote:
> Let's have an example here.
>
> sched_init_numa()
> {
> ...
> // A loop set sched_domains_numa_levels to level.-1
>
> // I set sched_domains_numa_levels to 0.
> sched_domains_numa_levels = 0;--
On Mon, 2012-09-24 at 19:11 -0700, Linus Torvalds wrote:
> In the not-so-distant past, we had the intel "Dunnington" Xeon, which
> was iirc basically three Core 2 duo's bolted together (ie three
> clusters of two cores sharing L2, and a fully shared L3). So that was
> a true multi-core with fairly
On Tue, 2012-09-25 at 15:42 +0400, Cyrill Gorcunov wrote:
> Guys, letme re-read this whole mail thread first since I have no clue
> what this remapping about ;)
x86_setup_perfctr() / set_ext_hw_attr() have special-purpose 0 and -1
config values to mean -ENOENT and -EINVAL resp.
This means neith
On Tue, 2012-09-25 at 14:23 +0100, Mel Gorman wrote:
> It crashes on boot due to the fact that you created a function-scope variable
> called sd_llc in select_idle_sibling() and shadowed the actual sd_llc you
> were interested in.
D'0h!
On Wed, 2012-09-26 at 15:20 +0200, Andrew Jones wrote:
> Wouldn't a clean solution be to promote a task's scheduler
> class to the spinner class when we PLE (or come from some special
> syscall
> for userspace spinlocks?)?
Userspace spinlocks are typically employed to avoid syscalls..
> That cla
On Wed, 2012-09-26 at 15:39 +0200, Andrew Jones wrote:
> On Wed, Sep 26, 2012 at 03:26:11PM +0200, Peter Zijlstra wrote:
> > On Wed, 2012-09-26 at 15:20 +0200, Andrew Jones wrote:
> > > Wouldn't a clean solution be to promote a task's scheduler
> > > class to t
On Fri, 2012-10-26 at 16:57 +0300, Kirill A. Shutemov wrote:
> > > Yes, this code will catch it:
> > >
> > > /* if an huge pmd materialized from under us just retry later */
> > > if (unlikely(pmd_trans_huge(*pmd)))
> > > return 0;
> > >
> > > If the pmd is under splitting it'
On Fri, 2012-10-26 at 15:50 +0200, Ingo Molnar wrote:
>
> Oh, just found the reason:
>
> the ptep_modify_prot_start()/modify()/commit() sequence is
> SMP-unsafe - it has to be done under the mmap_sem write-locked.
>
> It is safe against *hardware* updates to the PTE, but not safe
> against its
On Mon, 2012-10-29 at 19:02 +0800, Chuansheng Liu wrote:
> +/*
> + * dump_hrtimer_callinfo - print hrtimer information including:
> + * state, callback function, pid and start_site.
> +*/
> +static void dump_hrtimer_callinfo(struct hrtimer *timer)
> +{
> +
> + char symname[KSYM_NAME_LEN];
> +
On Sun, 2012-10-28 at 20:12 +0100, Andi Kleen wrote:
>
> Note I wrote and posted all this before you posted last week, but the wheels
> of perf review grind so slowly that you overtook me.
>
> Peter Z., to be honest all these later patches are just caused by not having
> generic TSX events/modifi
On Mon, 2012-10-29 at 19:08 +0900, Namhyung Kim wrote:
> That means it can support precise == 3?
It should; the difference between 2 and 3 is allowing for !EXACT_IP
samples. Not needing the LBR-based fixup, we should never have that, so
HSW might indeed allow for 3.
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> +EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x100b,umask=0x1,ldlat=3");
> +EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
I haven't fully grokked the macro magic yet, but event=0x100b seems
wrong, event only ta
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> +static u64 load_latency_data(u64 status)
> +{
> + union intel_x86_pebs_dse dse;
> + u64 val;
> + int model = boot_cpu_data.x86_model;
> + int fam = boot_cpu_data.x86;
> +
> + dse.val = status;
> +
> +
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> + fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT;
> +
> perf_sample_data_init(&data, 0, event->hw.last_period);
>
> + data.period = event->hw.last_period;
> + sample_type = event->attr.sample_type;
> +
> +
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> - if (fll) {
> + if (fll || fst) {
> if (sample_type & PERF_SAMPLE_ADDR)
> data.addr = pebs->dla;
>
> @@ -688,6 +731,8 @@ static void __intel_pmu_pebs_event(struct perf_event
> *event
On Mon, 2012-10-29 at 16:43 +0100, Stephane Eranian wrote:
> You meant fll, instead I think.
Oh, yes, too small font I guess.
> Well, that would work too, but I am trying to factorize the code
> with Precise Store which is a later patch.
Yeah, just found that, its fine the way it is. Just looke
On Mon, 2012-10-29 at 12:38 -0400, Vivek Goyal wrote:
> Ok, so the question is what's wrong with calling synchronize_rcu() inside
> a mutex with CONFIG_PREEMPT=y. I don't know. Ccing paul mckenney and
> peterz.
int blkcg_activate_policy(struct request_queue *q,
{
...
preloaded
On Mon, 2012-10-29 at 19:37 +0530, Raghavendra K T wrote:
> +/*
> + * A load of 2048 corresponds to 1:1 overcommit
> + * undercommit threshold is half the 1:1 overcommit
> + * overcommit threshold is 1.75 times of 1:1 overcommit threshold
> + */
> +#define COMMIT_THRESHOLD (FIXED_1)
> +#define UNDE