Re: [RFC PATCH 3/9] sched: add sched feature to disable idle core search

2019-09-05 Thread Subhra Mazumdar
On 9/5/19 3:17 AM, Patrick Bellasi wrote: On Fri, Aug 30, 2019 at 18:49:38 +0100, subhra mazumdar wrote... Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed

Re: [RFC PATCH 7/9] sched: search SMT before LLC domain

2019-09-05 Thread Subhra Mazumdar
On 9/5/19 2:31 AM, Peter Zijlstra wrote: On Fri, Aug 30, 2019 at 10:49:42AM -0700, subhra mazumdar wrote: Search SMT siblings before all CPUs in LLC domain for idle CPU. This helps in L1 cache locality. --- kernel/sched/fair.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff

[RFC PATCH 3/9] sched: add sched feature to disable idle core search

2019-08-30 Thread subhra mazumdar
Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 858589b..de4d506 100644 --- a

[RFC PATCH 6/9] x86/smpboot: Optimize cpumask_weight_sibling macro for x86

2019-08-30 Thread subhra mazumdar
through it to find the number of bits and gets it in O(1). Signed-off-by: subhra mazumdar --- arch/x86/include/asm/smp.h | 1 + arch/x86/include/asm/topology.h | 1 + arch/x86/kernel/smpboot.c | 17 - 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a
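The idea in this patch — precompute the number of SMT siblings per CPU once, so the scheduler fast path does an O(1) lookup instead of walking the sibling cpumask — can be sketched in userspace C. All names here (`sibling_mask`, `sibling_weight`, `topology_sibling_weight`) are illustrative stand-ins, not the kernel's actual symbols, and `__builtin_popcountll` stands in for the hardware popcount the commit alludes to:

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64

/* Hypothetical stand-ins for the kernel's per-cpu sibling bitmaps. */
static uint64_t sibling_mask[NR_CPUS];
static unsigned int sibling_weight[NR_CPUS]; /* precomputed at "boot" */

/* O(n) walk over the bitmap, analogous to cpumask_weight() iterating bits. */
static unsigned int weight_slow(uint64_t mask)
{
    unsigned int w = 0;
    while (mask) {
        w += mask & 1;
        mask >>= 1;
    }
    return w;
}

/* Precompute once per CPU so lookups in the scheduler fast path are O(1). */
static void precompute_sibling_weights(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        sibling_weight[cpu] = __builtin_popcountll(sibling_mask[cpu]);
}

static unsigned int topology_sibling_weight(int cpu)
{
    return sibling_weight[cpu]; /* single array read, no iteration */
}
```

The trade-off is one cached integer per CPU versus a bitmap walk on every wakeup, which is the "every cycle is worth saving" argument made elsewhere in this series.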

[RFC PATCH 5/9] sched: Define macro for number of CPUs in core

2019-08-30 Thread subhra mazumdar
Introduce macro topology_sibling_weight for number of sibling CPUs in a core and use in select_idle_cpu Signed-off-by: subhra mazumdar --- include/linux/topology.h | 4 kernel/sched/fair.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/linux/topology.h b

[RFC PATCH 0/9] Task latency-nice

2019-08-30 Thread subhra mazumdar
319.45 (22.87%) 333.95 (28.44%) 128 431.1 437.69 (1.53%) 431.09 (0%) subhra mazumdar (9): sched,cgroup: Add interface for latency-nice sched: add search limit as per latency-nice sched: add sched feature to disable idle core search sched: SIS_CORE to disable

[RFC PATCH 7/9] sched: search SMT before LLC domain

2019-08-30 Thread subhra mazumdar
Search SMT siblings before all CPUs in LLC domain for idle CPU. This helps in L1 cache locality. --- kernel/sched/fair.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 8856503..94dd4a32 100644 --- a/kernel/sched/fair.c +++ b/
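The reordering this patch describes — check the target's SMT siblings before scanning the rest of the LLC domain, because a sibling shares L1/L2 with the waker — can be sketched as follows. This is a toy model, not the fair.c code: it assumes 2-way SMT with siblings paired as `cpu ^ 1`, and all function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Toy model: one idle flag per CPU; 2-way SMT, siblings paired as cpu^1. */
static bool cpu_idle[NR_CPUS];

static int sibling_of(int cpu) { return cpu ^ 1; }

/*
 * Search order sketched by the patch: target first, then its SMT
 * sibling (warm L1), then the remaining CPUs of the LLC domain
 * (siblings get rechecked here, which is harmless).
 */
static int select_idle_cpu_sketch(int target)
{
    if (cpu_idle[target])
        return target;
    if (cpu_idle[sibling_of(target)])
        return sibling_of(target);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_idle[cpu])
            return cpu;
    return -1; /* no idle CPU anywhere in the LLC */
}
```

With both CPU 1 (sibling of 0) and CPU 5 idle, a wakeup targeting CPU 0 picks 1, trading a possibly farther idle CPU for cache locality.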

[RFC PATCH 8/9] sched: introduce per-cpu var next_cpu to track search limit

2019-08-30 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over the entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2
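The mechanism described here and in the companion "rotate the cpu search window" patch — remember where the last bounded scan stopped and resume from there, so successive searches sweep the whole LLC rather than always hammering the same leading CPUs — can be sketched in userspace. `next_cpu_cursor`, `scan_window`, and `LLC_CPUS` are illustrative names, not the kernel's:

```c
#include <assert.h>

#define LLC_CPUS 8

/* Hypothetical search cursor; the patch keeps one next_cpu per CPU. */
static int next_cpu_cursor = 0;

/*
 * Scan a bounded window of nr_scan CPUs starting where the previous
 * search stopped, wrapping around the LLC domain, and remember the
 * stop point for next time.
 */
static int scan_window(const int *idle, int nr_scan)
{
    int start = next_cpu_cursor;
    for (int i = 0; i < nr_scan; i++) {
        int cpu = (start + i) % LLC_CPUS;
        if (idle[cpu]) {
            next_cpu_cursor = (cpu + 1) % LLC_CPUS;
            return cpu;
        }
    }
    next_cpu_cursor = (start + nr_scan) % LLC_CPUS;
    return -1; /* nothing idle in this window */
}
```

Two consecutive 4-CPU scans over an 8-CPU LLC cover different halves, which is why an idle CPU is found quickly even with a small per-search limit.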

[RFC PATCH 4/9] sched: SIS_CORE to disable idle core search

2019-08-30 Thread subhra mazumdar
Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck; removing it improves throughput. There are also workloads where disabling it can hurt latency, so we need an option. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 8
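The shape of the change — gate the expensive whole-core search behind a runtime feature flag and fall through to the cheaper per-CPU scan when it is off — can be sketched like this. The stubs and the `sis_core_enabled` variable are hypothetical stand-ins for the kernel's `sched_feat(SIS_CORE)` machinery and the real search functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SIS_CORE sched_feat() toggle. */
static bool sis_core_enabled = true;

/* Stubs standing in for the real searches: pretend core 2 is fully
 * idle, and CPU 5 is the best the cheaper per-CPU scan can find. */
static int select_idle_core_stub(void) { return 2; }
static int select_idle_cpu_stub(void)  { return 5; }

/*
 * Sketch of the gating the patch adds: skip the potentially
 * expensive whole-core search when the feature is disabled.
 */
static int select_idle_sibling_sketch(void)
{
    if (sis_core_enabled) {
        int core = select_idle_core_stub();
        if (core >= 0)
            return core;
    }
    return select_idle_cpu_stub();
}
```

Flipping the flag trades the latency benefit of landing on a fully idle core for lower search cost on workloads where that search is the bottleneck — exactly the throughput/latency tension the commit message describes.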

[RFC PATCH 2/9] sched: add search limit as per latency-nice

2019-08-30 Thread subhra mazumdar
change the search cost making it appropriate for given workload. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index b08d00c..c31082d 100644 --- a/kernel/sched/fair.c

[RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice

2019-08-30 Thread subhra mazumdar
Add Cgroup interface for latency-nice. Each CPU Cgroup adds a new file "latency-nice" which is shared by all the threads in that Cgroup. Signed-off-by: subhra mazumdar --- include/linux/sched.h | 1 + kernel/sched/core.c | 40 kernel/sc

[RFC PATCH 9/9] sched: rotate the cpu search window for better spread

2019-08-30 Thread subhra mazumdar
Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched

Panic on v5.3-rc4

2019-08-15 Thread Subhra Mazumdar
I am getting the following panic during boot of tag v5.3-rc4 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git. I don't see the panic on tag v5.2 on same rig. Is it a bug or something legitimately changed? Thanks, Subhra [  147.184948] dracut Warning: No root device "block:/de

Re: [RFC PATCH 3/3] sched: introduce tunables to control soft affinity

2019-07-19 Thread Subhra Mazumdar
On 7/18/19 3:38 PM, Srikar Dronamraju wrote: * subhra mazumdar [2019-06-26 15:47:18]: For different workloads the optimal "softness" of soft affinity can be different. Introduce tunables sched_allowed and sched_preferred that can be tuned via /proc. This allows to chose at what u

Re: [RFC PATCH 2/3] sched: change scheduler to give preference to soft affinity CPUs

2019-07-18 Thread Subhra Mazumdar
On 7/18/19 5:07 PM, Peter Zijlstra wrote: On Wed, Jul 17, 2019 at 08:31:25AM +0530, Subhra Mazumdar wrote: On 7/2/19 10:58 PM, Peter Zijlstra wrote: On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote: The soft affinity CPUs present in the cpumask cpus_preferred is used by the

Re: [RFC PATCH 2/3] sched: change scheduler to give preference to soft affinity CPUs

2019-07-16 Thread Subhra Mazumdar
On 7/2/19 10:58 PM, Peter Zijlstra wrote: On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote: The soft affinity CPUs present in the cpumask cpus_preferred is used by the scheduler in two levels of search. First is in determining wake affine which choses the LLC domain and

Re: [PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-07-13 Thread Subhra Mazumdar
On 7/4/19 6:04 PM, Parth Shah wrote: Same experiment with hackbench and with perf analysis shows increase in L1 cache miss rate with these patches (Lower is better) Baseline(%) Patch(%) --- - --- Total Cache miss rate

Re: [RFC 0/2] Optimize the idle CPU search

2019-07-08 Thread Subhra Mazumdar
On 7/9/19 11:08 AM, Parth Shah wrote: On 7/9/19 5:38 AM, Subhra Mazumdar wrote: On 7/8/19 10:24 AM, Parth Shah wrote: When searching for an idle_sibling, scheduler first iterates to search for an idle core and then for an idle CPU. By maintaining the idle CPU mask while iterating through

Re: [RFC 0/2] Optimize the idle CPU search

2019-07-08 Thread Subhra Mazumdar
On 7/8/19 1:38 PM, Peter Zijlstra wrote: On Mon, Jul 08, 2019 at 10:24:30AM +0530, Parth Shah wrote: When searching for an idle_sibling, scheduler first iterates to search for an idle core and then for an idle CPU. By maintaining the idle CPU mask while iterating through idle cores, we can mar

Re: [RFC 0/2] Optimize the idle CPU search

2019-07-08 Thread Subhra Mazumdar
On 7/8/19 10:24 AM, Parth Shah wrote: When searching for an idle_sibling, scheduler first iterates to search for an idle core and then for an idle CPU. By maintaining the idle CPU mask while iterating through idle cores, we can mark non-idle CPUs for which idle CPU search would not have to iter

Re: [RESEND PATCH v3 0/7] Improve scheduler scalability for fast path

2019-07-02 Thread Subhra Mazumdar
On 7/2/19 1:54 AM, Patrick Bellasi wrote: Wondering if searching and preempting needs will ever be conflicting? I guess the winning point is that we don't commit behaviors to userspace, but just abstract concepts which are turned into biases. I don't see conflicts right now: if you are latency

Re: [PATCH V3 2/2] sched/fair: Fallback to sched-idle CPU if idle CPU isn't found

2019-07-02 Thread Subhra Mazumdar
On 7/2/19 1:35 AM, Peter Zijlstra wrote: On Mon, Jul 01, 2019 at 03:08:41PM -0700, Subhra Mazumdar wrote: On 7/1/19 1:03 AM, Viresh Kumar wrote: On 28-06-19, 18:16, Subhra Mazumdar wrote: On 6/25/19 10:06 PM, Viresh Kumar wrote: @@ -5376,6 +5376,15 @@ static struct { #endif

Re: [RESEND PATCH v3 0/7] Improve scheduler scalability for fast path

2019-07-01 Thread Subhra Mazumdar
On 7/1/19 6:55 AM, Patrick Bellasi wrote: On 01-Jul 11:02, Peter Zijlstra wrote: On Wed, Jun 26, 2019 at 06:29:12PM -0700, subhra mazumdar wrote: Hi, Resending this patchset, will be good to get some feedback. Any suggestions that will make it more acceptable are welcome. We have been

Re: [PATCH V3 2/2] sched/fair: Fallback to sched-idle CPU if idle CPU isn't found

2019-07-01 Thread Subhra Mazumdar
On 7/1/19 1:03 AM, Viresh Kumar wrote: On 28-06-19, 18:16, Subhra Mazumdar wrote: On 6/25/19 10:06 PM, Viresh Kumar wrote: We try to find an idle CPU to run the next task, but in case we don't find an idle CPU it is better to pick a CPU which will run the task the soonest, for perfor

Re: [PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-07-01 Thread Subhra Mazumdar
Also, systems like POWER9 have sd_llc as a pair of cores only. So it won't benefit from the limits, and hence hiding your code in select_idle_cpu behind static keys will be much preferred. If it doesn't hurt then I don't see the point. So this is the result from the POWER9 system with your pa

Re: [PATCH V3 2/2] sched/fair: Fallback to sched-idle CPU if idle CPU isn't found

2019-06-28 Thread Subhra Mazumdar
On 6/25/19 10:06 PM, Viresh Kumar wrote: We try to find an idle CPU to run the next task, but in case we don't find an idle CPU it is better to pick a CPU which will run the task the soonest, for performance reason. A CPU which isn't idle but has only SCHED_IDLE activity queued on it should be

Re: [PATCH v3 3/7] sched: rotate the cpu search window for better spread

2019-06-28 Thread Subhra Mazumdar
On 6/28/19 4:54 AM, Srikar Dronamraju wrote: * subhra mazumdar [2019-06-26 18:29:15]: Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. While rotating the cpu search window is good, not sure if this can find a idle

Re: [PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-06-28 Thread Subhra Mazumdar
On 6/28/19 12:01 PM, Parth Shah wrote: On 6/27/19 6:59 AM, subhra mazumdar wrote: Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck, removing it improves throughput. Also there are workloads where disabling it can hurt latency, so

Re: [PATCH v3 1/7] sched: limit cpu search in select_idle_cpu

2019-06-28 Thread Subhra Mazumdar
On 6/28/19 11:47 AM, Parth Shah wrote: On 6/27/19 6:59 AM, subhra mazumdar wrote: Put upper and lower limit on cpu search of select_idle_cpu. The lower limit is amount of cpus in a core while upper limit is twice that. This ensures for any architecture we will usually search beyond a core
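The bounds being discussed — a lower limit of one core's worth of CPUs and an upper limit of twice that, so any architecture searches a little beyond a single core without scanning the whole LLC — amount to a simple clamp. This is a sketch with illustrative names, not the fair.c code:

```c
#include <assert.h>

/*
 * Sketch of the limits described in the patch: scan at least
 * cpus_per_core CPUs and at most twice that many.
 */
static void search_limits(int cpus_per_core, int *floor, int *ceiling)
{
    *floor = cpus_per_core;
    *ceiling = 2 * cpus_per_core;
}

/* Clamp a proposed scan size (e.g. derived from avg idle time)
 * into the [core, 2*core] window. */
static int clamp_nr_scan(int proposed, int cpus_per_core)
{
    int lo, hi;
    search_limits(cpus_per_core, &lo, &hi);
    if (proposed < lo)
        return lo;
    if (proposed > hi)
        return hi;
    return proposed;
}
```

On a 4-SMT core this yields a scan of between 4 and 8 CPUs regardless of what the heuristic proposes, which is the "usually search beyond a core" guarantee the commit claims.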

Re: [PATCH v3 3/7] sched: rotate the cpu search window for better spread

2019-06-28 Thread Subhra Mazumdar
On 6/28/19 11:36 AM, Parth Shah wrote: Hi Subhra, I ran your patch series on IBM POWER systems and this is what I have observed. On 6/27/19 6:59 AM, subhra mazumdar wrote: Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one

Re: [PATCH v3 6/7] x86/smpboot: introduce per-cpu variable for HT siblings

2019-06-27 Thread Subhra Mazumdar
On 6/26/19 11:54 PM, Thomas Gleixner wrote: On Thu, 27 Jun 2019, Thomas Gleixner wrote: On Wed, 26 Jun 2019, subhra mazumdar wrote: Introduce a per-cpu variable to keep the number of HT siblings of a cpu. This will be used for quick lookup in select_idle_cpu to determine the limits of

Re: [PATCH v3 6/7] x86/smpboot: introduce per-cpu variable for HT siblings

2019-06-27 Thread Subhra Mazumdar
On 6/26/19 11:51 PM, Thomas Gleixner wrote: On Wed, 26 Jun 2019, subhra mazumdar wrote: Introduce a per-cpu variable to keep the number of HT siblings of a cpu. This will be used for quick lookup in select_idle_cpu to determine the limits of search. Why? The number of siblings is constant

[PATCH v3 7/7] sched: use per-cpu variable cpumask_weight_sibling

2019-06-26 Thread subhra mazumdar
Use per-cpu var cpumask_weight_sibling for quick lookup in select_idle_cpu. This is the fast path of scheduler and every cycle is worth saving. Usage of cpumask_weight can result in iterations. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1

[RESEND PATCH v3 0/7] Improve scheduler scalability for fast path

2019-06-26 Thread subhra mazumdar
SMTs for x86 subhra mazumdar (7): sched: limit cpu search in select_idle_cpu sched: introduce per-cpu var next_cpu to track search limit sched: rotate the cpu search window for better spread sched: add sched feature to disable idle core search sched: SIS_CORE to disable idle core sear

[PATCH v3 1/7] sched: limit cpu search in select_idle_cpu

2019-06-26 Thread subhra mazumdar
: subhra mazumdar --- kernel/sched/fair.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f35930f..b58f08f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6188,7 +6188,7 @@ static int select_idle_cpu

[PATCH v3 6/7] x86/smpboot: introduce per-cpu variable for HT siblings

2019-06-26 Thread subhra mazumdar
Introduce a per-cpu variable to keep the number of HT siblings of a cpu. This will be used for quick lookup in select_idle_cpu to determine the limits of search. This patch does it only for x86. Signed-off-by: subhra mazumdar --- arch/x86/include/asm/smp.h | 1 + arch/x86/include/asm

[PATCH v3 3/7] sched: rotate the cpu search window for better spread

2019-06-26 Thread subhra mazumdar
Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched

[PATCH v3 4/7] sched: add sched feature to disable idle core search

2019-06-26 Thread subhra mazumdar
Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 858589b..de4d506 100644 --- a

[PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-06-26 Thread subhra mazumdar
Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck; removing it improves throughput. There are also workloads where disabling it can hurt latency, so we need an option. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 8

[PATCH v3 2/7] sched: introduce per-cpu var next_cpu to track search limit

2019-06-26 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over the entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2

[RFC PATCH 1/3] sched: Introduce new interface for scheduler soft affinity

2019-06-26 Thread subhra mazumdar
boolean affinity_unequal is used to store if they are unequal for fast lookup. Setting hard affinity resets soft affinity set to be equal to it. Soft affinity is only allowed for CFS class threads. Signed-off-by: subhra mazumdar --- arch/x86/entry/syscalls/syscall_64.tbl | 1 + include/linux

[RFC PATCH 2/3] sched: change scheduler to give preference to soft affinity CPUs

2019-06-26 Thread subhra mazumdar
; together they achieve the "softness" of scheduling. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 137 ++-- 1 file changed, 100 insertions(+), 37 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f35930

[RFC PATCH 0/3] Scheduler Soft Affinity

2019-06-26 Thread subhra mazumdar
NUMA nodes. This showed similar improvements with soft affinity for 2 instance case, thus proving the improvement is due to saving LLC coherence overhead. subhra mazumdar (3): sched: Introduce new interface for scheduler soft affinity sched: change scheduler to give preference to soft affinity CP

[RFC PATCH 3/3] sched: introduce tunables to control soft affinity

2019-06-26 Thread subhra mazumdar
t level of search. Depending on the extent of data sharing, cache coherency overhead of the system etc. the optimal point may vary. Signed-off-by: subhra mazumdar --- include/linux/sched/sysctl.h | 2 ++ kernel/sched/fair.c | 19 ++- kernel/sched/sched.h | 2

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-06-13 Thread Subhra Mazumdar
On 6/12/19 9:33 AM, Julien Desfossez wrote: After reading more traces and trying to understand why only untagged tasks are starving when there are cpu-intensive tasks running on the same set of CPUs, we noticed a difference in behavior in ‘pick_task’. In the case where ‘core_cookie’ is 0, we ar

[PATCH v3 4/7] sched: add sched feature to disable idle core search

2019-06-08 Thread subhra mazumdar
Add a new sched feature SIS_CORE to have an option to disable idle core search (select_idle_core). Signed-off-by: subhra mazumdar --- kernel/sched/features.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/features.h b/kernel/sched/features.h index 858589b..de4d506 100644 --- a

[PATCH v3 6/7] x86/smpboot: introduce per-cpu variable for HT siblings

2019-06-08 Thread subhra mazumdar
Introduce a per-cpu variable to keep the number of HT siblings of a cpu. This will be used for quick lookup in select_idle_cpu to determine the limits of search. This patch does it only for x86. Signed-off-by: subhra mazumdar --- arch/x86/include/asm/smp.h | 1 + arch/x86/include/asm

[PATCH v3 5/7] sched: SIS_CORE to disable idle core search

2019-06-08 Thread subhra mazumdar
Use SIS_CORE to disable idle core search. For some workloads select_idle_core becomes a scalability bottleneck; removing it improves throughput. There are also workloads where disabling it can hurt latency, so we need an option. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 8

[PATCH v3 1/7] sched: limit cpu search in select_idle_cpu

2019-06-08 Thread subhra mazumdar
: subhra mazumdar --- kernel/sched/fair.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f35930f..b58f08f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6188,7 +6188,7 @@ static int select_idle_cpu

[PATCH v3 7/7] sched: use per-cpu variable cpumask_weight_sibling

2019-06-08 Thread subhra mazumdar
Use per-cpu var cpumask_weight_sibling for quick lookup in select_idle_cpu. This is the fast path of scheduler and every cycle is worth saving. Usage of cpumask_weight can result in iterations. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1

[PATCH v3 2/7] sched: introduce per-cpu var next_cpu to track search limit

2019-06-08 Thread subhra mazumdar
Introduce a per-cpu variable to track the limit up to which idle cpu search was done in select_idle_cpu(). This will help to start the search next time from there. This is necessary for rotating the search window over the entire LLC domain. Signed-off-by: subhra mazumdar --- kernel/sched/core.c | 2

[PATCH v3 3/7] sched: rotate the cpu search window for better spread

2019-06-08 Thread subhra mazumdar
Rotate the cpu search window for better spread of threads. This will ensure an idle cpu will quickly be found if one exists. Signed-off-by: subhra mazumdar --- kernel/sched/fair.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched

[PATCH v3 0/7] Improve scheduler scalability for fast path

2019-06-08 Thread subhra mazumdar
stead of multiplication to compute limit -Use per-CPU variable to precompute the number of sibling SMTs for x86 subhra mazumdar (7): sched: limit cpu search in select_idle_cpu sched: introduce per-cpu var next_cpu to track search limit sched: rotate the cpu search window for better spread sched: a

Re: [RFC V2 2/2] sched/fair: Fallback to sched-idle CPU if idle CPU isn't found

2019-05-14 Thread Subhra Mazumdar
On 5/14/19 10:27 AM, Subhra Mazumdar wrote: On 5/14/19 9:03 AM, Steven Sistare wrote: On 5/13/2019 7:35 AM, Peter Zijlstra wrote: On Mon, May 13, 2019 at 03:04:18PM +0530, Viresh Kumar wrote: On 10-05-19, 09:21, Peter Zijlstra wrote: I don't hate this per se; but the

Re: [RFC V2 2/2] sched/fair: Fallback to sched-idle CPU if idle CPU isn't found

2019-05-14 Thread Subhra Mazumdar
On 5/14/19 9:03 AM, Steven Sistare wrote: On 5/13/2019 7:35 AM, Peter Zijlstra wrote: On Mon, May 13, 2019 at 03:04:18PM +0530, Viresh Kumar wrote: On 10-05-19, 09:21, Peter Zijlstra wrote: I don't hate this per se; but the whole select_idle_sibling() thing is something that needs looking at.

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-05-09 Thread Subhra Mazumdar
select_task_rq_* seems to be unchanged. So the search logic to find a cpu to enqueue when a task becomes runnable is same as before and doesn't do any kind of cookie matching. Okay, that's true in task wakeup path, and also load_balance seems to pull task without checking cookie too. But my sy

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-05-08 Thread Subhra Mazumdar
On 5/8/19 6:38 PM, Aubrey Li wrote: On Thu, May 9, 2019 at 8:29 AM Subhra Mazumdar wrote: On 5/8/19 5:01 PM, Aubrey Li wrote: On Thu, May 9, 2019 at 2:41 AM Subhra Mazumdar wrote: On 5/8/19 11:19 AM, Subhra Mazumdar wrote: On 5/8/19 8:49 AM, Aubrey Li wrote: Pawan ran an experiment

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-05-08 Thread Subhra Mazumdar
On 5/8/19 5:01 PM, Aubrey Li wrote: On Thu, May 9, 2019 at 2:41 AM Subhra Mazumdar wrote: On 5/8/19 11:19 AM, Subhra Mazumdar wrote: On 5/8/19 8:49 AM, Aubrey Li wrote: Pawan ran an experiment setting up 2 VMs, with one VM doing a parallel kernel build and one VM doing sysbench, limiting

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-05-08 Thread Subhra Mazumdar
On 5/8/19 11:19 AM, Subhra Mazumdar wrote: On 5/8/19 8:49 AM, Aubrey Li wrote: Pawan ran an experiment setting up 2 VMs, with one VM doing a parallel kernel build and one VM doing sysbench, limiting both VMs to run on 16 cpu threads (8 physical cores), with 8 vcpu for each VM. Making the

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-05-08 Thread Subhra Mazumdar
On 5/8/19 8:49 AM, Aubrey Li wrote: Pawan ran an experiment setting up 2 VMs, with one VM doing a parallel kernel build and one VM doing sysbench, limiting both VMs to run on 16 cpu threads (8 physical cores), with 8 vcpu for each VM. Making the fix did improve kernel build time by 7%. I'm g

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-04-26 Thread Subhra Mazumdar
On 4/26/19 3:43 AM, Mel Gorman wrote: On Fri, Apr 26, 2019 at 10:42:22AM +0200, Ingo Molnar wrote: It should, but it's not perfect. For example, wake_affine_idle does not take sibling activity into account even though select_idle_sibling *may* take it into account. Even select_idle_sibling in

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-19 Thread Subhra Mazumdar
On 4/19/19 1:40 AM, Ingo Molnar wrote: * Subhra Mazumdar wrote: I see similar improvement with this patch as removing the condition I earlier mentioned. So that's not needed. I also included the patch for the priority fix. For 2 DB instances, HT disabling stands at -22% for 32 users

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-10 Thread Subhra Mazumdar
On 4/9/19 11:38 AM, Julien Desfossez wrote: We found the source of the major performance regression we discussed previously. It turns out there was a pattern where a task (a kworker in this case) could be woken up, but the core could still end up idle before that task had a chance to run. Exam

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-04-04 Thread Subhra Mazumdar
We tried to comment those lines and it doesn’t seem to get rid of the performance regression we are seeing. Can you elaborate a bit more about the test you are performing, what kind of resources it uses? I am running 1 and 2 Oracle DB instances each running TPC-C workload. The clients driving

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-04-01 Thread Subhra Mazumdar
On 3/29/19 3:23 PM, Subhra Mazumdar wrote: On 3/29/19 6:35 AM, Julien Desfossez wrote: On Fri, Mar 22, 2019 at 8:09 PM Subhra Mazumdar wrote: Is the core wide lock primarily responsible for the regression? I ran upto patch 12 which also has the core wide lock for tagged cgroups and also

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-03-29 Thread Subhra Mazumdar
On 3/29/19 6:35 AM, Julien Desfossez wrote: On Fri, Mar 22, 2019 at 8:09 PM Subhra Mazumdar wrote: Is the core wide lock primarily responsible for the regression? I ran upto patch 12 which also has the core wide lock for tagged cgroups and also calls newidle_balance() from pick_next_task

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-03-26 Thread Subhra Mazumdar
On 3/22/19 5:06 PM, Subhra Mazumdar wrote: On 3/21/19 2:20 PM, Julien Desfossez wrote: On Tue, Mar 19, 2019 at 10:31 PM Subhra Mazumdar wrote: On 3/18/19 8:41 AM, Julien Desfossez wrote: On further investigation, we could see that the contention is mostly in the way rq locks are taken

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-03-22 Thread Subhra Mazumdar
On 3/21/19 2:20 PM, Julien Desfossez wrote: On Tue, Mar 19, 2019 at 10:31 PM Subhra Mazumdar wrote: On 3/18/19 8:41 AM, Julien Desfossez wrote: On further investigation, we could see that the contention is mostly in the way rq locks are taken. With this patchset, we lock the whole core if

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-03-19 Thread Subhra Mazumdar
On 3/18/19 8:41 AM, Julien Desfossez wrote: The case where we try to acquire the lock on 2 runqueues belonging to 2 different cores requires the rq_lockp wrapper as well otherwise we frequently deadlock in there. This fixes the crash reported in 1552577311-8218-1-git-send-email-jdesfos...@digi

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-11 Thread Subhra Mazumdar
On 3/11/19 5:20 PM, Greg Kerr wrote: On Mon, Mar 11, 2019 at 4:36 PM Subhra Mazumdar wrote: On 3/11/19 11:34 AM, Subhra Mazumdar wrote: On 3/10/19 9:23 PM, Aubrey Li wrote: On Sat, Mar 9, 2019 at 3:50 AM Subhra Mazumdar wrote: expected. Most of the performance recovery happens in patch

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-11 Thread Subhra Mazumdar
On 3/11/19 11:34 AM, Subhra Mazumdar wrote: On 3/10/19 9:23 PM, Aubrey Li wrote: On Sat, Mar 9, 2019 at 3:50 AM Subhra Mazumdar wrote: expected. Most of the performance recovery happens in patch 15 which, unfortunately, is also the one that introduces the hard lockup. After applied

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-11 Thread Subhra Mazumdar
On 3/10/19 9:23 PM, Aubrey Li wrote: On Sat, Mar 9, 2019 at 3:50 AM Subhra Mazumdar wrote: expected. Most of the performance recovery happens in patch 15 which, unfortunately, is also the one that introduces the hard lockup. After applied Subhra's patch, the following is trigger

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-08 Thread Subhra Mazumdar
On 2/22/19 4:45 AM, Mel Gorman wrote: On Mon, Feb 18, 2019 at 09:49:10AM -0800, Linus Torvalds wrote: On Mon, Feb 18, 2019 at 9:40 AM Peter Zijlstra wrote: However; whichever way around you turn this cookie; it is expensive and nasty. Do you (or anybody else) have numbers for real loads? B

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-02-28 Thread Subhra Mazumdar
On 2/18/19 8:56 AM, Peter Zijlstra wrote: A much 'demanded' feature: core-scheduling :-( I still hate it with a passion, and that is part of why it took a little longer than 'promised'. While this one doesn't have all the 'features' of the previous (never published) version and isn't L1TF 'co

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-02-21 Thread Subhra Mazumdar
On 2/21/19 6:03 AM, Peter Zijlstra wrote: On Wed, Feb 20, 2019 at 06:53:08PM -0800, Subhra Mazumdar wrote: On 2/18/19 9:49 AM, Linus Torvalds wrote: On Mon, Feb 18, 2019 at 9:40 AM Peter Zijlstra wrote: However; whichever way around you turn this cookie; it is expensive and nasty. Do you

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-02-20 Thread Subhra Mazumdar
On 2/18/19 9:49 AM, Linus Torvalds wrote: On Mon, Feb 18, 2019 at 9:40 AM Peter Zijlstra wrote: However; whichever way around you turn this cookie; it is expensive and nasty. Do you (or anybody else) have numbers for real loads? Because performance is all that matters. If performance is bad

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-02-20 Thread Subhra Mazumdar
On 2/20/19 1:42 AM, Peter Zijlstra wrote: A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail? On Tue, Feb 19, 2019 at 02:07:01PM -0800, Greg Kerr wrote: Thanks for posting t

Re: Gang scheduling

2019-02-12 Thread Subhra Mazumdar
Hi Tim, On 10/12/18 11:01 AM, Tim Chen wrote: On 10/10/2018 05:09 PM, Subhra Mazumdar wrote: Hi, I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github. https://github.com/pdxChen/gang/commits/sched_1.23-loadbal

Re: [RFC PATCH v2 1/1] pipe: busy wait for pipe

2018-11-05 Thread Subhra Mazumdar
On 11/5/18 2:08 AM, Mel Gorman wrote: Adding Al Viro as per get_maintainers.pl. On Tue, Sep 25, 2018 at 04:32:40PM -0700, subhra mazumdar wrote: Introduce pipe_ll_usec field for pipes that indicates the amount of micro seconds a thread should spin if pipe is empty or full before sleeping

Re: [PATCH 00/10] steal tasks to improve CPU utilization

2018-11-02 Thread Subhra Mazumdar
On 10/22/18 7:59 AM, Steve Sistare wrote: When a CPU has no more CFS tasks to run, and idle_balance() fails to find a task, then attempt to steal a task from an overloaded CPU in the same LLC. Maintain and use a bitmap of overloaded CPUs to efficiently identify candidates. To minimize search t
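The candidate-selection mechanism described here — maintain an LLC-wide bitmap of overloaded CPUs so a newly idle CPU can jump straight to set bits instead of scanning every runqueue — can be sketched in userspace. The bitmap and helper names are hypothetical, and `__builtin_ctzll` stands in for the find-first-bit primitive:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical LLC-wide bitmap of overloaded CPUs
 * (bit set = more than one runnable CFS task). */
static uint64_t overload_mask;

static void set_overloaded(int cpu)   { overload_mask |=  (1ULL << cpu); }
static void clear_overloaded(int cpu) { overload_mask &= ~(1ULL << cpu); }

/*
 * Sketch of the candidate search: skip over idle and singly-loaded
 * CPUs entirely by consulting only the set bits, excluding ourselves.
 */
static int first_steal_candidate(int self)
{
    uint64_t m = overload_mask & ~(1ULL << self);
    if (!m)
        return -1; /* nobody is overloaded; nothing to steal */
    return __builtin_ctzll(m); /* lowest set bit = first candidate */
}
```

The cost of a failed search drops from O(CPUs in LLC) runqueue inspections to a single bitmap test, which is the "minimize search time" point Steve's cover letter makes.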

Re: [RFC 00/60] Coscheduling for Linux

2018-10-29 Thread Subhra Mazumdar
On 10/26/18 4:44 PM, Jan H. Schönherr wrote: On 19/10/2018 02.26, Subhra Mazumdar wrote: Hi Jan, Hi. Sorry for the delay. On 9/7/18 2:39 PM, Jan H. Schönherr wrote: The collective context switch from one coscheduled set of tasks to another -- while fast -- is not atomic. If a use-case

Re: [RFC 00/60] Coscheduling for Linux

2018-10-26 Thread Subhra Mazumdar
D) What can I *not* do with this? - Besides the missing load-balancing within coscheduled task-groups, this implementation has the following properties, which might be considered short-comings. This particular implementation focuses on SCHED_OTHER tasks manage

Re: [RFC 00/60] Coscheduling for Linux

2018-10-18 Thread Subhra Mazumdar
Hi Jan, On 9/7/18 2:39 PM, Jan H. Schönherr wrote: The collective context switch from one coscheduled set of tasks to another -- while fast -- is not atomic. If a use-case needs the absolute guarantee that all tasks of the previous set have stopped executing before any task of the next set start

Re: Gang scheduling

2018-10-15 Thread Subhra Mazumdar
On 10/12/2018 11:01 AM, Tim Chen wrote: On 10/10/2018 05:09 PM, Subhra Mazumdar wrote: Hi, I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github. https://github.com/pdxChen/gang/commits/sched_1.23-loadbal I

Gang scheduling

2018-10-10 Thread Subhra Mazumdar
Hi, I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github. https://github.com/pdxChen/gang/commits/sched_1.23-loadbal I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also i

Re: [RFC 00/60] Coscheduling for Linux

2018-09-27 Thread Subhra Mazumdar
On 09/26/2018 02:58 AM, Jan H. Schönherr wrote: On 09/17/2018 02:25 PM, Peter Zijlstra wrote: On Fri, Sep 14, 2018 at 06:25:44PM +0200, Jan H. Schönherr wrote: Assuming, there is a cgroup-less solution that can prevent simultaneous execution of tasks on a core, when they're not supposed to.

Re: [RFC 00/60] Coscheduling for Linux

2018-09-27 Thread Subhra Mazumdar
On 09/24/2018 08:43 AM, Jan H. Schönherr wrote: On 09/19/2018 11:53 PM, Subhra Mazumdar wrote: Can we have a more generic interface, like specifying a set of task ids to be co-scheduled with a particular level rather than tying this with cgroups? KVMs may not always run with cgroups and

[RFC PATCH v2 1/1] pipe: busy wait for pipe

2018-09-25 Thread subhra mazumdar
. Other similar use cases can benefit. A tunable pipe_busy_poll is introduced to enable or disable busy waiting via /proc. Its value specifies the amount of spin in microseconds; the default value of 0 indicates no spin. Signed-off-by: subhra mazumdar --- fs/pipe.c | 12

[RFC PATCH v2 0/1] Pipe busy wait

2018-09-25 Thread subhra mazumdar
ead -Added usr+sys time for hackbench runs as shown by time command subhra mazumdar (1): pipe: busy wait for pipe fs/pipe.c | 12 include/linux/pipe_fs_i.h | 2 ++ kernel/sysctl.c | 7 +++ 3 files changed, 21 insertions(+) -- 2.9.3

Re: [RFC 00/60] Coscheduling for Linux

2018-09-19 Thread Subhra Mazumdar
On 09/18/2018 04:44 AM, Jan H. Schönherr wrote: On 09/18/2018 02:33 AM, Subhra Mazumdar wrote: On 09/07/2018 02:39 PM, Jan H. Schönherr wrote: A) Quickstart guide for the impatient. -- Here is a quickstart guide to set up coscheduling at core-level for

Re: [RFC PATCH 2/2] pipe: use pipe busy wait

2018-09-17 Thread Subhra Mazumdar
On 09/17/2018 03:43 PM, Peter Zijlstra wrote: On Mon, Sep 17, 2018 at 02:05:40PM -0700, Subhra Mazumdar wrote: On 09/07/2018 05:25 AM, Peter Zijlstra wrote: Why not just busy wait on current->state ? A little something like: diff --git a/fs/pipe.c b/fs/pipe.c index bdc5d3c09

Re: [RFC 00/60] Coscheduling for Linux

2018-09-17 Thread Subhra Mazumdar
On 09/07/2018 02:39 PM, Jan H. Schönherr wrote: This patch series extends CFS with support for coscheduling. The implementation is versatile enough to cover many different coscheduling use-cases, while at the same time being non-intrusive, so that behavior of legacy workloads does not change.

Re: [RFC PATCH 2/2] pipe: use pipe busy wait

2018-09-17 Thread Subhra Mazumdar
On 09/07/2018 05:25 AM, Peter Zijlstra wrote: On Thu, Aug 30, 2018 at 01:24:58PM -0700, subhra mazumdar wrote: +void pipe_busy_wait(struct pipe_inode_info *pipe) +{ + unsigned long wait_flag = pipe->pipe_wait_flag; + unsigned long start_time = pipe_busy_loop_current_t

Re: [RFC PATCH 1/2] pipe: introduce busy wait for pipe

2018-09-04 Thread Subhra Mazumdar
On 08/31/2018 09:09 AM, Steven Sistare wrote: On 8/30/2018 4:24 PM, subhra mazumdar wrote: Introduce pipe_ll_usec field for pipes that indicates the number of microseconds a thread should spin if the pipe is empty or full before sleeping. This is similar to network sockets. Workloads like

Re: [RFC PATCH 2/2] pipe: use pipe busy wait

2018-09-04 Thread Subhra Mazumdar
On 09/04/2018 02:54 PM, Thomas Gleixner wrote: On Thu, 30 Aug 2018, subhra mazumdar wrote: +void pipe_busy_wait(struct pipe_inode_info *pipe) +{ + unsigned long wait_flag = pipe->pipe_wait_flag; + unsigned long start_time = pipe_busy_loop_current_time(); + + pipe_unl

Re: [RFC PATCH 0/2] Pipe busy wait

2018-08-30 Thread Subhra Mazumdar
On 08/30/2018 01:24 PM, subhra mazumdar wrote: This patch introduces busy waiting for pipes similar to network sockets. When pipe is full or empty a thread busy waits for some microseconds before sleeping. This avoids the sleep and wakeup overhead and improves performance in case wakeup

[RFC PATCH 2/2] pipe: use pipe busy wait

2018-08-30 Thread subhra mazumdar
. Its value specifies the amount of spin in microseconds. Signed-off-by: subhra mazumdar --- fs/pipe.c | 58 +-- include/linux/pipe_fs_i.h | 1 + kernel/sysctl.c | 7 ++ 3 files changed, 64 insertions(+), 2 deletions

[RFC PATCH 1/2] pipe: introduce busy wait for pipe

2018-08-30 Thread subhra mazumdar
. Other similar use cases can benefit. pipe_wait_flag is used to signal any busy-waiting thread. pipe_busy_loop_timeout checks whether the spin time is over. Signed-off-by: subhra mazumdar --- include/linux/pipe_fs_i.h | 19 +++ 1 file changed, 19 insertions(+) diff --git a/include/linux

[RFC PATCH 0/2] Pipe busy wait

2018-08-30 Thread subhra mazumdar
%) subhra mazumdar (2): pipe: introduce busy wait for pipe pipe: use pipe busy wait fs/pipe.c | 58 +-- include/linux/pipe_fs_i.h | 20 kernel/sysctl.c | 7 ++ 3 files changed, 83 insertions(+), 2
