[RFC PATCH] sched: select_idle_core should select least utilized core

2017-06-08 Thread Subhra Mazumdar
hyperthreads and return an idle cpu in that core. Signed-off-by: Subhra Mazumdar <subhra.mazum...@oracle.com> --- kernel/sched/fair.c | 113 +- kernel/sched/idle_task.c | 1 - kernel/sched/sched.h | 10 3 files changed, 21 inse

Re: [RFC PATCH] sched: select_idle_core should select least utilized core

2017-06-08 Thread subhra mazumdar
On 06/08/2017 12:59 PM, Peter Zijlstra wrote: On Thu, Jun 08, 2017 at 03:26:32PM -0400, Subhra Mazumdar wrote: Current select_idle_core tries to find a fully idle core and if it fails select_idle_cpu next returns any idle cpu in the llc domain. This is not optimal for architectures with many
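
The snippet breaks off, but the proposal is clear enough to sketch: when no fully idle core exists, fall back to the core with the fewest busy hyperthreads instead of any idle cpu in the LLC. A minimal user-space model, with an SMT-2 topology and busy flags standing in for runqueue state (none of this is the patch's actual code):

#include <stdio.h>

#define NR_CPUS  8
#define SMT      2                      /* hyperthreads per core */
#define NR_CORES (NR_CPUS / SMT)

/* 1 = busy, 0 = idle; stand-in for per-runqueue state */
static int cpu_busy[NR_CPUS] = { 1, 1, 1, 0, 1, 0, 1, 1 };

/* Return an idle cpu on the least utilized core, or -1 if none. */
static int select_least_utilized_core(void)
{
    int best_core = -1, best_busy = SMT + 1;

    for (int core = 0; core < NR_CORES; core++) {
        int busy = 0;

        for (int t = 0; t < SMT; t++)
            busy += cpu_busy[core * SMT + t];

        /* usable only if it has at least one idle hyperthread */
        if (busy < SMT && busy < best_busy) {
            best_busy = busy;
            best_core = core;
        }
    }
    if (best_core < 0)
        return -1;
    for (int t = 0; t < SMT; t++)
        if (!cpu_busy[best_core * SMT + t])
            return best_core * SMT + t;
    return -1;
}

int main(void)
{
    printf("picked cpu %d\n", select_least_utilized_core());
    return 0;
}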

[RFC PATCH] sched: Improve scalability of select_idle_sibling using SMT balance

2017-12-08 Thread subhra mazumdar
baseline-rc6 %stdev patch %stdev context_switch() 663.8799 4.46 687.4068 (+3.54%) 2.85 select_idle_sibling() 0.556 1.72 0.263 (-52.70%) 0.78 Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.com> --- include/linux/sched/topology.h

Re: [PATCH 1/3] sched: remove select_idle_core() for scalability

2018-05-04 Thread Subhra Mazumdar
On 05/02/2018 02:58 PM, Subhra Mazumdar wrote: On 05/01/2018 11:03 AM, Peter Zijlstra wrote: On Mon, Apr 30, 2018 at 04:38:42PM -0700, Subhra Mazumdar wrote: I also noticed a possible bug later in the merge code. Shouldn't it be: if (busy < best_busy) { best_busy = b

Re: [PATCH 1/5] sched: limit cpu search in select_idle_cpu

2018-06-12 Thread Subhra Mazumdar
improve the system] url: https://github.com/0day-ci/linux/commits/subhra-mazumdar/Improve-scheduler-scalability-for-fast-path/20180613-015158 config: i386-randconfig-x070-201823 (attached as .config) compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 reproduce: # save the attached .config to linux

[PATCH 5/5] sched: SIS_CORE to disable idle core search

2018-06-12 Thread subhra mazumdar
Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck, removing it improves throughput. Also there are workloads where disabling it can hurt latency, so need to have an option. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 8
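
Taken together with patch 4/5 below, the shape of the change is a feature-gated fast path. A runnable toy model of the gating (the stubs and cpu numbers are invented; only the SIS_CORE on/off structure reflects the patch):

#include <stdbool.h>
#include <stdio.h>

/* Models SCHED_FEAT(SIS_CORE, true) from kernel/sched/features.h.
 * Flipping it to false models "echo NO_SIS_CORE > sched_features". */
static bool sis_core = true;

static int select_idle_core_model(void) { return -1; } /* pretend: no fully idle core */
static int select_idle_cpu_model(void)  { return 5;  } /* pretend: cpu 5 is idle */

/* The gated fast path: the potentially O(n) idle-core scan only
 * runs when the SIS_CORE feature is enabled. */
static int select_idle_sibling_model(void)
{
    int cpu;

    if (sis_core) {
        cpu = select_idle_core_model();
        if (cpu >= 0)
            return cpu;
    }
    return select_idle_cpu_model();
}

int main(void)
{
    printf("cpu %d\n", select_idle_sibling_model());
    sis_core = false;   /* workloads hurt by the core scan disable it */
    printf("cpu %d\n", select_idle_sibling_model());
    return 0;
}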

[PATCH 1/5] sched: limit cpu search in select_idle_cpu

2018-06-12 Thread subhra mazumdar
-by: subhra mazumdar --- kernel/sched/fair.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index e497c05..9a6d28d 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6372,7 +6372,7 @@ static int select_idle_cpu
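
Per the V2 cover letter further down, the scan bounds in select_idle_cpu() are derived from the number of cpus in a core. A sketch of the clamping, where the exact constants (one and two cores' worth of cpus) are an assumption, not taken from the patch:

#include <stdio.h>

#define SMT_WEIGHT 2      /* cpus (hyperthreads) per core */

/*
 * Clamp the number of cpus scanned by the select_idle_cpu() fallback.
 * The cover letter says the bounds derive from the cpus-per-core
 * count; the floor/ceiling chosen here are illustrative.
 */
static int search_limit(int proportional_nr)
{
    int floor   = SMT_WEIGHT;          /* always scan at least one core */
    int ceiling = 2 * SMT_WEIGHT;      /* never scan more than two      */

    if (proportional_nr < floor)
        return floor;
    if (proportional_nr > ceiling)
        return ceiling;
    return proportional_nr;
}

int main(void)
{
    /* proportional_nr would come from avg_idle/avg_cost in the kernel */
    for (int nr = 0; nr <= 6; nr++)
        printf("nr=%d -> scan %d cpus\n", nr, search_limit(nr));
    return 0;
}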

[PATCH 2/5] sched: introduce per-cpu var next_cpu to track search limit

2018-06-12 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2

[PATCH 3/5] sched: rotate the cpu search window for better spread

2018-06-12 Thread subhra mazumdar
Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched
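
A toy model of the rotation, folded together with the per-cpu next_cpu variable that patch 2/5 above introduces (cpu count, idle stub, and wrap arithmetic are all illustrative):

#include <stdio.h>

#define NR_CPUS 8

/* Models the per-cpu next_cpu from patch 2/5: where the last bounded
 * idle-cpu search on this target left off (-1 = never searched). */
static int next_cpu[NR_CPUS] = { -1, -1, -1, -1, -1, -1, -1, -1 };

static int cpu_idle(int cpu) { return cpu == 6; }  /* pretend cpu 6 is idle */

/*
 * Bounded idle-cpu search that resumes from next_cpu[target] and
 * wraps, so successive searches rotate over the whole LLC domain
 * instead of re-scanning the same leading cpus every time.
 */
static int select_idle_cpu_rotating(int target, int limit)
{
    int start = (next_cpu[target] != -1) ? next_cpu[target] : target;
    int cpu = start, found = -1;

    for (int scanned = 0; scanned < limit; scanned++) {
        cpu = (cpu + 1) % NR_CPUS;         /* wrap over the LLC span */
        if (cpu_idle(cpu)) {
            found = cpu;
            break;
        }
    }
    next_cpu[target] = cpu;  /* next search starts where this one ended */
    return found;
}

int main(void)
{
    for (int i = 0; i < 4; i++)
        printf("search %d: found cpu %d, window now at %d\n",
               i, select_idle_cpu_rotating(0, 4), next_cpu[0]);
    return 0;
}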

[PATCH 4/5] sched: add sched feature to disable idle core search

2018-06-12 Thread subhra mazumdar
Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 85ae848..de15733 100644

[RFC/RFT V2 PATCH 0/5] Improve scheduler scalability for fast path

2018-06-12 Thread subhra mazumdar
- Compute the upper and lower limit based on number of cpus in a core - Split up the search limit and search window rotation into separate patches - Add new sched feature to have option of disabling idle core search subhra mazumdar (5): sched: limit cpu search in select_idle_cpu sched

Re: [PATCH 1/3] sched: remove select_idle_core() for scalability

2018-05-30 Thread Subhra Mazumdar
On 05/29/2018 02:36 PM, Peter Zijlstra wrote: On Wed, May 02, 2018 at 02:58:42PM -0700, Subhra Mazumdar wrote: I re-ran the test after fixing that bug but still get similar regressions for hackbench Hackbench process on 2 socket, 44 core and 88 threads Intel x86 machine (lower is better

Re: [PATCH 1/3] sched: remove select_idle_core() for scalability

2018-05-02 Thread Subhra Mazumdar
On 05/01/2018 11:03 AM, Peter Zijlstra wrote: On Mon, Apr 30, 2018 at 04:38:42PM -0700, Subhra Mazumdar wrote: I also noticed a possible bug later in the merge code. Shouldn't it be: if (busy < best_busy) {     best_busy = busy;     best_cpu = first_idle; } Uhh, quite. I did
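
For readers skimming the thread: the bug under discussion is the classic track-the-minimum pattern, where the busy count and the remembered cpu must be updated together. A standalone illustration with invented per-core data:

#include <stdio.h>

#define NR_CORES 4

int main(void)
{
    /* busy-thread count and first idle cpu per core (made-up values) */
    int busy_count[NR_CORES] = { 2, 1, 2, 1 };
    int first_idle[NR_CORES] = { -1, 3, -1, 7 };

    int best_busy = 1 << 30, best_cpu = -1;

    for (int core = 0; core < NR_CORES; core++) {
        int busy = busy_count[core];

        /* the corrected merge from the thread: update BOTH fields */
        if (first_idle[core] >= 0 && busy < best_busy) {
            best_busy = busy;
            best_cpu = first_idle[core];
        }
    }
    printf("least busy core's idle cpu: %d\n", best_cpu);
    return 0;
}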

Re: [PATCH 1/3] sched: remove select_idle_core() for scalability

2018-04-30 Thread Subhra Mazumdar
On 04/25/2018 10:49 AM, Peter Zijlstra wrote: On Tue, Apr 24, 2018 at 02:45:50PM -0700, Subhra Mazumdar wrote: So what you said makes sense in theory but is not borne out by real world results. This indicates that threads of these benchmarks care more about running immediately on any idle cpu

[RFC PATCH V2] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-08 Thread subhra mazumdar
(-52.70%) 0.78 Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.com> --- include/linux/sched/topology.h | 2 + kernel/sched/core.c | 38 +++ kernel/sched/fair.c | 245 - kernel/sched/idle_task.c | 1 -

Re: [RFC PATCH V2] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-10 Thread Subhra Mazumdar
On 01/09/2018 06:50 AM, Steven Sistare wrote: On 1/8/2018 5:18 PM, Peter Zijlstra wrote: On Mon, Jan 08, 2018 at 02:12:37PM -0800, subhra mazumdar wrote: @@ -2751,6 +2763,31 @@ context_switch(struct rq *rq, struct task_struct *prev, struct task_struct *next, struct rq_flags

[RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-12 Thread subhra mazumdar
%stdev select_idle_sibling() 0.556 1.72 0.263 (-52.70%) 0.78 Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.com> --- include/linux/sched/topology.h | 2 + kernel/sched/core.c | 43 +++ kernel/sched/fair.c

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-02 Thread Subhra Mazumdar
On 2/2/18 9:17 AM, Peter Zijlstra wrote: On Fri, Feb 02, 2018 at 11:53:40AM -0500, Steven Sistare wrote: +static int select_idle_smt(struct task_struct *p, struct sched_group *sg) { + int i, rand_index, rand_cpu; + int this_cpu = smp_processor_id(); + rand_index =

[RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-29 Thread subhra mazumdar
%stdev select_idle_sibling() 0.556 1.72 0.263 (-52.70%) 0.78 Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.com> --- include/linux/sched/topology.h | 2 + kernel/sched/core.c | 43 +++ kernel/sched/fair.c

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-07 Thread Subhra Mazumdar
On 02/07/2018 12:42 AM, Peter Zijlstra wrote: On Tue, Feb 06, 2018 at 04:30:03PM -0800, Subhra Mazumdar wrote: I meant the SMT balance patch. That does comparison with only one other random core and takes the decision in O(1). Any potential scan of all cores or cpus is O(n) and doesn't scale
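
The O(1) claim, modeled: compare the wakeup core's SMT utilization against a single randomly chosen core and place the task on whichever is less loaded, with no scan over all cores (the utilization counters here are invented stand-ins for what the patch maintains at context switch):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NR_CORES 4

/* count of busy hyperthreads per core */
static int smt_util[NR_CORES] = { 2, 0, 1, 2 };

/*
 * O(1) placement: look at the preferred core and one random other
 * core, pick the less loaded of the two.  Cost is constant no matter
 * how many cores the LLC has.
 */
static int smt_balance_pick(int pref_core)
{
    int other = rand() % NR_CORES;

    if (other != pref_core && smt_util[other] < smt_util[pref_core])
        return other;
    return pref_core;
}

int main(void)
{
    srand((unsigned)time(NULL));
    for (int i = 0; i < 5; i++)
        printf("wakeup on core 0 -> placed on core %d\n",
               smt_balance_pick(0));
    return 0;
}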

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-06 Thread Subhra Mazumdar
On 02/06/2018 01:12 AM, Peter Zijlstra wrote: On Mon, Feb 05, 2018 at 02:09:11PM -0800, Subhra Mazumdar wrote: The pseudo random is also used for choosing a random core to compare with, how will transposing achieve that? Not entirely sure what your point is. Current code doesn't compare

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-05 Thread Subhra Mazumdar
On 02/05/2018 09:03 AM, Peter Zijlstra wrote: On Mon, Feb 05, 2018 at 01:48:54PM +0100, Peter Zijlstra wrote: So while I see the point of tracking these numbers (for SMT>2), I don't think its worth doing outside of the core, and then we still need some powerpc (or any other architecture with

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-05 Thread Subhra Mazumdar
On 02/05/2018 04:19 AM, Peter Zijlstra wrote: On Fri, Feb 02, 2018 at 09:37:02AM -0800, Subhra Mazumdar wrote: In the scheme of SMT balance, if the idle cpu search is done _not_ in the last run core, then we need a random cpu to start from. If the idle cpu search is done in the last run core

Re: [RFC PATCH] sched: Improve scalability of select_idle_sibling using SMT balance

2017-12-20 Thread Subhra Mazumdar
On 12/19/2017 11:36 AM, Peter Zijlstra wrote: On Fri, Dec 08, 2017 at 12:07:54PM -0800, subhra mazumdar wrote: +static inline void +sd_context_switch(struct sched_domain *sd, struct rq *rq, int util) +{ + struct sched_group *sg_cpu; + + /* atomically add/subtract the util

[RFC PATCH 2/2] pipe: use pipe busy wait

2018-08-30 Thread subhra mazumdar
. The value of it specifies the amount of spin in microseconds. Signed-off-by: subhra mazumdar --- fs/pipe.c | 58 +-- include/linux/pipe_fs_i.h | 1 + kernel/sysctl.c | 7 ++ 3 files changed, 64 insertions(+), 2 deletions

[RFC PATCH 0/2] Pipe busy wait

2018-08-30 Thread subhra mazumdar
%) subhra mazumdar (2): pipe: introduce busy wait for pipe pipe: use pipe busy wait fs/pipe.c | 58 +-- include/linux/pipe_fs_i.h | 20 kernel/sysctl.c | 7 ++ 3 files changed, 83 insertions(+), 2

[RFC PATCH 1/2] pipe: introduce busy wait for pipe

2018-08-30 Thread subhra mazumdar
. Other similar usecases can benefit. pipe_wait_flag is used to signal any thread busy waiting. pipe_busy_loop_timeout checks if spin time is over. Signed-off-by: subhra mazumdar --- include/linux/pipe_fs_i.h | 19 +++ 1 file changed, 19 insertions(+) diff --git a/include/linux

Re: [RFC PATCH 0/2] Pipe busy wait

2018-08-30 Thread Subhra Mazumdar
On 08/30/2018 01:24 PM, subhra mazumdar wrote: This patch introduces busy waiting for pipes similar to network sockets. When pipe is full or empty a thread busy waits for some microseconds before sleeping. This avoids the sleep and wakeup overhead and improves performance in case wakeup
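
A user-space model of the spin-then-sleep decision the series adds around the pipe wait path (the sysctl name and every detail below are illustrative, not the patch's code):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* models the proposed pipe_busy_poll sysctl (microseconds, 0 = off) */
static unsigned int pipe_busy_poll_us = 50;

/*
 * Spin until the condition becomes true or the budget expires.
 * Returns true if spinning paid off, false if the caller should take
 * the slow path and actually sleep.
 */
static bool pipe_busy_wait(volatile int *data_ready)
{
    uint64_t deadline = now_ns() + (uint64_t)pipe_busy_poll_us * 1000;

    if (!pipe_busy_poll_us)
        return false;

    while (now_ns() < deadline) {
        if (*data_ready)
            return true;   /* saved a sleep + wakeup round trip */
    }
    return false;          /* budget spent; go sleep */
}

int main(void)
{
    volatile int ready = 0;
    printf("spin hit: %d (expected 0, nobody wrote)\n",
           pipe_busy_wait(&ready));
    ready = 1;
    printf("spin hit: %d (expected 1)\n", pipe_busy_wait(&ready));
    return 0;
}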

Re: [RFC PATCH 2/2] pipe: use pipe busy wait

2018-09-04 Thread Subhra Mazumdar
On 09/04/2018 02:54 PM, Thomas Gleixner wrote: On Thu, 30 Aug 2018, subhra mazumdar wrote: +void pipe_busy_wait(struct pipe_inode_info *pipe) +{ + unsigned long wait_flag = pipe->pipe_wait_flag; + unsigned long start_time = pipe_busy_loop_current_time(); + + pipe_unl

Re: [RFC PATCH 1/2] pipe: introduce busy wait for pipe

2018-09-04 Thread Subhra Mazumdar
On 08/31/2018 09:09 AM, Steven Sistare wrote: On 8/30/2018 4:24 PM, subhra mazumdar wrote: Introduce pipe_ll_usec field for pipes that indicates the amount of micro seconds a thread should spin if pipe is empty or full before sleeping. This is similar to network sockets. Workloads like

Re: [PATCH 1/3] sched: remove select_idle_core() for scalability

2018-04-24 Thread Subhra Mazumdar
On 04/24/2018 05:46 AM, Peter Zijlstra wrote: On Mon, Apr 23, 2018 at 05:41:14PM -0700, subhra mazumdar wrote: select_idle_core() can potentially search all cpus to find the fully idle core even if there is one such core. Removing this is necessary to achieve scalability in the fast path. So

Re: [PATCH 3/3] sched: limit cpu search and rotate search window for scalability

2018-04-24 Thread Subhra Mazumdar
On 04/24/2018 05:48 AM, Peter Zijlstra wrote: On Mon, Apr 23, 2018 at 05:41:16PM -0700, subhra mazumdar wrote: + if (per_cpu(next_cpu, target) != -1) + target_tmp = per_cpu(next_cpu, target); + else + target_tmp = target; + This one; what's the point

Re: [PATCH 3/3] sched: limit cpu search and rotate search window for scalability

2018-04-24 Thread Subhra Mazumdar
On 04/24/2018 05:53 AM, Peter Zijlstra wrote: On Mon, Apr 23, 2018 at 05:41:16PM -0700, subhra mazumdar wrote: Lower the lower limit of idle cpu search in select_idle_cpu() and also put an upper limit. This helps in scalability of the search by restricting the search window. @@ -6297,15

Re: [PATCH 2/3] sched: introduce per-cpu var next_cpu to track search limit

2018-04-24 Thread Subhra Mazumdar
On 04/24/2018 05:47 AM, Peter Zijlstra wrote: On Mon, Apr 23, 2018 at 05:41:15PM -0700, subhra mazumdar wrote: @@ -17,6 +17,7 @@ #include DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues); +DEFINE_PER_CPU_SHARED_ALIGNED(int, next_cpu); #if defined(CONFIG_SCHED_DEBUG

Re: [PATCH 3/3] sched: limit cpu search and rotate search window for scalability

2018-04-24 Thread Subhra Mazumdar
On 04/24/2018 05:48 AM, Peter Zijlstra wrote: On Mon, Apr 23, 2018 at 05:41:16PM -0700, subhra mazumdar wrote: Lower the lower limit of idle cpu search in select_idle_cpu() and also put an upper limit. This helps in scalability of the search by restricting the search window. Also rotating

[PATCH 1/3] sched: remove select_idle_core() for scalability

2018-04-23 Thread subhra mazumdar
select_idle_core() can potentially search all cpus to find the fully idle core even if there is one such core. Removing this is necessary to achieve scalability in the fast path. Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.com> --- include/linux/sched/topology.h | 1 - kernel

[PATCH 2/3] sched: introduce per-cpu var next_cpu to track search limit

2018-04-23 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over entire LLC domain. Signed-off-by: subhra mazumdar <subhra.mazum...@oracle.

[RFC/RFT PATCH 0/3] Improve scheduler scalability for fast path

2018-04-23 Thread subhra mazumdar
Intel x86 machine with no statistically significant regressions while giving improvements in some cases. I am not listing the results due to too many data points. subhra mazumdar (3): sched: remove select_idle_core() for scalability sched: introduce per-cpu var next_cpu to track search limit

[PATCH 3/3] sched: limit cpu search and rotate search window for scalability

2018-04-23 Thread subhra mazumdar
-by: subhra mazumdar <subhra.mazum...@oracle.com> --- kernel/sched/fair.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d1d4769..62d585b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -

[PATCH 4/5] sched: add sched feature to disable idle core search

2018-06-28 Thread subhra mazumdar
Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 85ae848..de15733 100644

[PATCH 2/5] sched: introduce per-cpu var next_cpu to track search limit

2018-06-28 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2

[PATCH 5/5] sched: SIS_CORE to disable idle core search

2018-06-28 Thread subhra mazumdar
Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck, removing it improves throughput. Also there are workloads where disabling it can hurt latency, so need to have an option. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 8

[PATCH 1/5] sched: limit cpu search in select_idle_cpu

2018-06-28 Thread subhra mazumdar
-by: subhra mazumdar --- kernel/sched/fair.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index e497c05..7243146 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6372,7 +6372,7 @@ static int select_idle_cpu

[PATCH 3/5] sched: rotate the cpu search window for better spread

2018-06-28 Thread subhra mazumdar
Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched

[RESEND RFC/RFT V2 PATCH 0/5] Improve scheduler scalability for fast path

2018-06-28 Thread subhra mazumdar
- Compute the upper and lower limit based on number of cpus in a core - Split up the search limit and search window rotation into separate patches - Add new sched feature to have option of disabling idle core search subhra mazumdar (5): sched: limit cpu search in select_idle_cpu sched

Re: Gang scheduling

2018-10-15 Thread Subhra Mazumdar
On 10/12/2018 11:01 AM, Tim Chen wrote: On 10/10/2018 05:09 PM, Subhra Mazumdar wrote: Hi, I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github. https://github.com/pdxChen/gang/commits/sched_1.23-loadbal I

Re: [RFC 00/60] Coscheduling for Linux

2018-10-18 Thread Subhra Mazumdar
Hi Jan, On 9/7/18 2:39 PM, Jan H. Schönherr wrote: The collective context switch from one coscheduled set of tasks to another -- while fast -- is not atomic. If a use-case needs the absolute guarantee that all tasks of the previous set have stopped executing before any task of the next set

Re: [RFC 00/60] Coscheduling for Linux

2018-10-29 Thread Subhra Mazumdar
On 10/26/18 4:44 PM, Jan H. Schönherr wrote: On 19/10/2018 02.26, Subhra Mazumdar wrote: Hi Jan, Hi. Sorry for the delay. On 9/7/18 2:39 PM, Jan H. Schönherr wrote: The collective context switch from one coscheduled set of tasks to another -- while fast -- is not atomic. If a use-case

Re: [RFC 00/60] Coscheduling for Linux

2018-10-26 Thread Subhra Mazumdar
D) What can I *not* do with this? - Besides the missing load-balancing within coscheduled task-groups, this implementation has the following properties, which might be considered short-comings. This particular implementation focuses on SCHED_OTHER tasks

Re: [PATCH 00/10] steal tasks to improve CPU utilization

2018-11-02 Thread Subhra Mazumdar
On 10/22/18 7:59 AM, Steve Sistare wrote: When a CPU has no more CFS tasks to run, and idle_balance() fails to find a task, then attempt to steal a task from an overloaded CPU in the same LLC. Maintain and use a bitmap of overloaded CPUs to efficiently identify candidates. To minimize search
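
The bitmap idea in miniature: a bit per cpu marks overload, so an idle cpu finds a steal candidate with a single word scan instead of walking every runqueue (the shape below is a guess at the mechanism, not the series' code):

#include <stdint.h>
#include <stdio.h>

static uint64_t overload_mask;   /* bit n set => cpu n has extra CFS tasks */

static void set_overload(int cpu, int overloaded)
{
    if (overloaded)
        overload_mask |= 1ull << cpu;
    else
        overload_mask &= ~(1ull << cpu);
}

/* First overloaded cpu other than 'self', or -1: the candidate an
 * idle cpu would try to steal from after idle_balance() found nothing. */
static int find_steal_candidate(int self)
{
    uint64_t mask = overload_mask & ~(1ull << self);

    if (!mask)
        return -1;
    return __builtin_ctzll(mask);  /* index of lowest set bit */
}

int main(void)
{
    set_overload(7, 1);
    set_overload(42, 1);
    printf("cpu 3 steals from cpu %d\n", find_steal_candidate(3));
    set_overload(7, 0);
    printf("cpu 3 steals from cpu %d\n", find_steal_candidate(3));
    return 0;
}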

Re: [RFC PATCH v2 1/1] pipe: busy wait for pipe

2018-11-05 Thread Subhra Mazumdar
On 11/5/18 2:08 AM, Mel Gorman wrote: Adding Al Viro as per get_maintainers.pl. On Tue, Sep 25, 2018 at 04:32:40PM -0700, subhra mazumdar wrote: Introduce pipe_ll_usec field for pipes that indicates the amount of micro seconds a thread should spin if pipe is empty or full before sleeping

Gang scheduling

2018-10-10 Thread Subhra Mazumdar
Hi, I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github. https://github.com/pdxChen/gang/commits/sched_1.23-loadbal I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also

Re: [RFC 00/60] Coscheduling for Linux

2018-09-19 Thread Subhra Mazumdar
On 09/18/2018 04:44 AM, Jan H. Schönherr wrote: On 09/18/2018 02:33 AM, Subhra Mazumdar wrote: On 09/07/2018 02:39 PM, Jan H. Schönherr wrote: A) Quickstart guide for the impatient. -- Here is a quickstart guide to set up coscheduling at core-level

[RFC PATCH v2 0/1] Pipe busy wait

2018-09-25 Thread subhra mazumdar
ead -Added usr+sys time for hackbench runs as shown by time command subhra mazumdar (1): pipe: busy wait for pipe fs/pipe.c | 12 include/linux/pipe_fs_i.h | 2 ++ kernel/sysctl.c | 7 +++ 3 files changed, 21 insertions(+) -- 2.9.3

[RFC PATCH v2 1/1] pipe: busy wait for pipe

2018-09-25 Thread subhra mazumdar
. Other similar usecases can benefit. A tunable pipe_busy_poll is introduced to enable or disable busy waiting via /proc. The value of it specifies the amount of spin in microseconds. Default value is 0 indicating no spin. Signed-off-by: subhra mazumdar --- fs/pipe.c | 12

Re: [RFC 00/60] Coscheduling for Linux

2018-09-27 Thread Subhra Mazumdar
On 09/24/2018 08:43 AM, Jan H. Schönherr wrote: On 09/19/2018 11:53 PM, Subhra Mazumdar wrote: Can we have a more generic interface, like specifying a set of task ids to be co-scheduled with a particular level rather than tying this with cgroups? KVMs may not always run with cgroups

Re: [RFC 00/60] Coscheduling for Linux

2018-09-27 Thread Subhra Mazumdar
On 09/26/2018 02:58 AM, Jan H. Schönherr wrote: On 09/17/2018 02:25 PM, Peter Zijlstra wrote: On Fri, Sep 14, 2018 at 06:25:44PM +0200, Jan H. Schönherr wrote: Assuming, there is a cgroup-less solution that can prevent simultaneous execution of tasks on a core, when they're not supposed

Re: [RFC PATCH 2/2] pipe: use pipe busy wait

2018-09-17 Thread Subhra Mazumdar
On 09/07/2018 05:25 AM, Peter Zijlstra wrote: On Thu, Aug 30, 2018 at 01:24:58PM -0700, subhra mazumdar wrote: +void pipe_busy_wait(struct pipe_inode_info *pipe) +{ + unsigned long wait_flag = pipe->pipe_wait_flag; + unsigned long start_time = pipe_busy_loop_current_t

Re: [RFC 00/60] Coscheduling for Linux

2018-09-17 Thread Subhra Mazumdar
On 09/07/2018 02:39 PM, Jan H. Schönherr wrote: This patch series extends CFS with support for coscheduling. The implementation is versatile enough to cover many different coscheduling use-cases, while at the same time being non-intrusive, so that behavior of legacy workloads does not change.

Re: [RFC PATCH 2/2] pipe: use pipe busy wait

2018-09-17 Thread Subhra Mazumdar
On 09/17/2018 03:43 PM, Peter Zijlstra wrote: On Mon, Sep 17, 2018 at 02:05:40PM -0700, Subhra Mazumdar wrote: On 09/07/2018 05:25 AM, Peter Zijlstra wrote: Why not just busy wait on current->state ? A little something like: diff --git a/fs/pipe.c b/fs/pipe.c index bdc5d3c09
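
Peter's suggestion in outline: a waker already resets the sleeper's task state to TASK_RUNNING, so the sleeper can spin on its own state before committing to schedule(), with no new flag needed. A rough pthread model (states and the unbounded spin are illustrative only):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define TASK_RUNNING       0
#define TASK_INTERRUPTIBLE 1

static atomic_int task_state = TASK_RUNNING;

/* waker side: wake_up() would put the sleeper back to TASK_RUNNING */
static void *waker(void *arg)
{
    (void)arg;
    atomic_store(&task_state, TASK_RUNNING);
    return NULL;
}

int main(void)
{
    pthread_t t;
    long spins = 0;

    /* sleeper declares intent to sleep ... */
    atomic_store(&task_state, TASK_INTERRUPTIBLE);
    pthread_create(&t, NULL, waker, NULL);

    /* ... then busy waits on its own state instead of a new flag;
     * a real implementation would bound this and then schedule() */
    while (atomic_load(&task_state) != TASK_RUNNING)
        spins++;

    pthread_join(t, NULL);
    printf("woken after %ld spins without sleeping\n", spins);
    return 0;
}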

[RFC PATCH V2] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-08 Thread subhra mazumdar
(-52.70%) 0.78 Signed-off-by: subhra mazumdar --- include/linux/sched/topology.h | 2 + kernel/sched/core.c | 38 +++ kernel/sched/fair.c | 245 - kernel/sched/idle_task.c | 1 - kernel/sched/sched.h | 26

[PATCH 1/3] sched: remove select_idle_core() for scalability

2018-04-23 Thread subhra mazumdar
select_idle_core() can potentially search all cpus to find the fully idle core even if there is one such core. Removing this is necessary to achieve scalability in the fast path. Signed-off-by: subhra mazumdar --- include/linux/sched/topology.h | 1 - kernel/sched/fair.c| 97

[PATCH 2/3] sched: introduce per-cpu var next_cpu to track search limit

2018-04-23 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2

[PATCH 3/3] sched: limit cpu search and rotate search window for scalability

2018-04-23 Thread subhra mazumdar
-by: subhra mazumdar --- kernel/sched/fair.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d1d4769..62d585b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6279,7 +6279,7 @@ static int

[RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-12 Thread subhra mazumdar
%stdev select_idle_sibling() 0.556 1.72 0.263 (-52.70%) 0.78 Signed-off-by: subhra mazumdar --- include/linux/sched/topology.h | 2 + kernel/sched/core.c | 43 +++ kernel/sched/fair.c | 251

[RFC PATCH] sched: Improve scalability of select_idle_sibling using SMT balance

2017-12-08 Thread subhra mazumdar
baseline-rc6 %stdev patch %stdev context_switch() 663.8799 4.46 687.4068 (+3.54%) 2.85 select_idle_sibling() 0.556 1.72 0.263 (-52.70%) 0.78 Signed-off-by: subhra mazumdar --- include/linux/sched/topology.h | 4 + kernel/sched/core.c

Re: [RFC 0/2] Optimize the idle CPU search

2019-07-08 Thread Subhra Mazumdar
On 7/8/19 10:24 AM, Parth Shah wrote: When searching for an idle_sibling, scheduler first iterates to search for an idle core and then for an idle CPU. By maintaining the idle CPU mask while iterating through idle cores, we can mark non-idle CPUs for which idle CPU search would not have to
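
The optimization being reviewed, in miniature: record each cpu's idleness while the idle-core walk is already touching it, so the fallback cpu pass reads the saved mask instead of re-querying (all code below is a model, not the RFC's):

#include <stdint.h>
#include <stdio.h>

#define NR_CPUS 8
#define SMT     2

static int cpu_busy[NR_CPUS] = { 1, 1, 1, 0, 1, 1, 1, 1 };

/*
 * Scan for a fully idle core and, as a side effect, record each
 * cpu's idleness in a mask.  If no fully idle core exists, the
 * fallback idle-cpu pass consults the mask for free.
 */
static int select_idle_core_marking(uint8_t *idle_mask)
{
    for (int core = 0; core < NR_CPUS / SMT; core++) {
        int all_idle = 1;

        for (int t = 0; t < SMT; t++) {
            int cpu = core * SMT + t;
            if (cpu_busy[cpu])
                all_idle = 0;
            else
                *idle_mask |= 1u << cpu;  /* remember it */
        }
        if (all_idle)
            return core * SMT;
    }
    return -1;
}

int main(void)
{
    uint8_t idle_mask = 0;
    int core_cpu = select_idle_core_marking(&idle_mask);

    if (core_cpu >= 0)
        printf("fully idle core at cpu %d\n", core_cpu);
    else if (idle_mask)   /* fallback: first cpu marked idle */
        printf("idle cpu %d from mask\n", __builtin_ctz(idle_mask));
    else
        printf("no idle cpu\n");
    return 0;
}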

Re: [RFC 0/2] Optimize the idle CPU search

2019-07-09 Thread Subhra Mazumdar
On 7/8/19 1:38 PM, Peter Zijlstra wrote: On Mon, Jul 08, 2019 at 10:24:30AM +0530, Parth Shah wrote: When searching for an idle_sibling, scheduler first iterates to search for an idle core and then for an idle CPU. By maintaining the idle CPU mask while iterating through idle cores, we can

Re: [RFC 0/2] Optimize the idle CPU search

2019-07-09 Thread Subhra Mazumdar
On 7/9/19 11:08 AM, Parth Shah wrote: On 7/9/19 5:38 AM, Subhra Mazumdar wrote: On 7/8/19 10:24 AM, Parth Shah wrote: When searching for an idle_sibling, scheduler first iterates to search for an idle core and then for an idle CPU. By maintaining the idle CPU mask while iterating through

Panic on v5.3-rc4

2019-08-15 Thread Subhra Mazumdar
I am getting the following panic during boot of tag v5.3-rc4 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git. I don't see the panic on tag v5.2 on same rig. Is it a bug or something legitimately changed? Thanks, Subhra [  147.184948] dracut Warning: No root device

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-06-13 Thread Subhra Mazumdar
On 6/12/19 9:33 AM, Julien Desfossez wrote: After reading more traces and trying to understand why only untagged tasks are starving when there are cpu-intensive tasks running on the same set of CPUs, we noticed a difference in behavior in ‘pick_task’. In the case where ‘core_cookie’ is 0, we

Re: [RFC PATCH 2/3] sched: change scheduler to give preference to soft affinity CPUs

2019-07-16 Thread Subhra Mazumdar
On 7/2/19 10:58 PM, Peter Zijlstra wrote: On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote: The soft affinity CPUs present in the cpumask cpus_preferred is used by the scheduler in two levels of search. First is in determining wake affine which chooses the LLC domain
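
The two levels under discussion, modeled with plain bitmasks: look for an idle cpu in cpus_preferred first, and only widen to the full cpus_allowed when the preferred set has nothing (the mask names follow the RFC; everything else is invented):

#include <stdint.h>
#include <stdio.h>

static int cpu_idle(int cpu) { return cpu == 2 || cpu == 6; }

/* search a mask for an idle cpu; -1 if none */
static int find_idle(uint32_t mask)
{
    for (int cpu = 0; cpu < 32; cpu++)
        if ((mask & (1u << cpu)) && cpu_idle(cpu))
            return cpu;
    return -1;
}

/* soft affinity: preferred cpus first, then the full allowed set */
static int select_cpu_soft_affinity(uint32_t preferred, uint32_t allowed)
{
    int cpu = find_idle(preferred & allowed);
    if (cpu >= 0)
        return cpu;
    return find_idle(allowed);
}

int main(void)
{
    uint32_t allowed = 0xFFu;             /* cpus 0-7 */

    /* idle cpu inside the preferred set: placement stays soft-affine */
    printf("picked cpu %d\n", select_cpu_soft_affinity(0x0Fu, allowed));

    /* no idle cpu among cpus 4-5: search widens to the allowed set */
    printf("picked cpu %d\n", select_cpu_soft_affinity(0x30u, allowed));
    return 0;
}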

[RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-01-29 Thread subhra mazumdar
%stdev select_idle_sibling() 0.556 1.72 0.263 (-52.70%) 0.78 Signed-off-by: subhra mazumdar --- include/linux/sched/topology.h | 2 + kernel/sched/core.c | 43 +++ kernel/sched/fair.c | 247

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-07 Thread Subhra Mazumdar
On 02/07/2018 12:42 AM, Peter Zijlstra wrote: On Tue, Feb 06, 2018 at 04:30:03PM -0800, Subhra Mazumdar wrote: I meant the SMT balance patch. That does comparison with only one other random core and takes the decision in O(1). Any potential scan of all cores or cpus is O(n) and doesn't scale

Re: [PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-07-13 Thread Subhra Mazumdar
On 7/4/19 6:04 PM, Parth Shah wrote: Same experiment with hackbench and with perf analysis shows increase in L1 cache miss rate with these patches (Lower is better): Baseline(%) Patch(%) Total Cache miss rate

[PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-06-08 Thread subhra mazumdar
Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck, removing it improves throughput. Also there are workloads where disabling it can hurt latency, so need to have an option. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 8

[PATCH v3 1/7] sched: limit cpu search in select_idle_cpu

2019-06-08 Thread subhra mazumdar
-by: subhra mazumdar --- kernel/sched/fair.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f35930f..b58f08f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6188,7 +6188,7 @@ static int select_idle_cpu

[PATCH v3 7/7] sched: use per-cpu variable cpumask_weight_sibling

2019-06-08 Thread subhra mazumdar
Use per-cpu var cpumask_weight_sibling for quick lookup in select_idle_cpu. This is the fast path of scheduler and every cycle is worth saving. Usage of cpumask_weight can result in iterations. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1
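
The 7/7 micro-optimization modeled (6/7 below supplies the per-cpu variable on x86): cpumask_weight() walks mask bits on every call, so the sibling count is computed once at topology setup and cached for the wakeup fast path:

#include <stdint.h>
#include <stdio.h>

#define NR_CPUS 8

static uint32_t sibling_mask[NR_CPUS];        /* HT siblings of each cpu */
static int cpumask_weight_sibling[NR_CPUS];   /* cached popcount         */

/* slow path: what cpumask_weight() effectively does per call */
static int cpumask_weight(uint32_t mask)
{
    return __builtin_popcount(mask);
}

/* done once at boot/topology update, not on every wakeup */
static void cache_sibling_weights(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        sibling_mask[cpu] = 3u << (cpu & ~1);   /* SMT-2 pairs: (0,1)(2,3)... */
        cpumask_weight_sibling[cpu] = cpumask_weight(sibling_mask[cpu]);
    }
}

int main(void)
{
    cache_sibling_weights();
    /* fast path: select_idle_cpu() reads the cached value, O(1) */
    printf("cpu 5 has %d sibling threads\n", cpumask_weight_sibling[5]);
    return 0;
}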

[PATCH v3 2/7] sched: introduce per-cpu var next_cpu to track search limit

2019-06-08 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2

[PATCH v3 3/7] sched: rotate the cpu search window for better spread

2019-06-08 Thread subhra mazumdar
Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched

[PATCH v3 0/7] Improve scheduler scalability for fast path

2019-06-08 Thread subhra mazumdar
ead of multiplication to compute limit -Use per-CPU variable to precompute the number of sibling SMTs for x86 subhra mazumdar (7): sched: limit cpu search in select_idle_cpu sched: introduce per-cpu var next_cpu to track search limit sched: rotate the cpu search window for better spread sched: add sc

[PATCH v3 6/7] x86/smpboot: introduce per-cpu variable for HT siblings

2019-06-08 Thread subhra mazumdar
Introduce a per-cpu variable to keep the number of HT siblings of a cpu. This will be used for quick lookup in select_idle_cpu to determine the limits of search. This patch does it only for x86. Signed-off-by: subhra mazumdar --- arch/x86/include/asm/smp.h | 1 + arch/x86/include/asm

[PATCH v3 4/7] sched: add sched feature to disable idle core search

2019-06-08 Thread subhra mazumdar
Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 858589b..de4d506 100644

Re: [RFC PATCH 2/3] sched: change scheduler to give preference to soft affinity CPUs

2019-07-18 Thread Subhra Mazumdar
On 7/18/19 5:07 PM, Peter Zijlstra wrote: On Wed, Jul 17, 2019 at 08:31:25AM +0530, Subhra Mazumdar wrote: On 7/2/19 10:58 PM, Peter Zijlstra wrote: On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote: The soft affinity CPUs present in the cpumask cpus_preferred is used

Re: [RFC PATCH 3/3] sched: introduce tunables to control soft affinity

2019-07-19 Thread Subhra Mazumdar
On 7/18/19 3:38 PM, Srikar Dronamraju wrote: * subhra mazumdar [2019-06-26 15:47:18]: For different workloads the optimal "softness" of soft affinity can be different. Introduce tunables sched_allowed and sched_preferred that can be tuned via /proc. This allows to choose at what u
