[PATCH v2 2/2] x86/sgx: Log information when a node lacks an EPC section

2024-09-05 Thread Aaron Lu
node with both CPUs and memory lacks an EPC section. This will provide users with a hint as to why they might be experiencing less-than-ideal performance when running SGX enclaves. Suggested-by: Dave Hansen Signed-off-by: Aaron Lu --- arch/x86/kernel/cpu/sgx/main.c | 7 +++ 1 file changed, 7
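The archive truncates the patch body; a minimal sketch of the kind of check the changelog describes, assuming the existing sgx_numa_mask nodemask and a message whose exact wording is not shown here (this is illustrative, not the applied diff):

        /* During SGX init: note every node that has CPUs and memory but no
         * EPC section, since enclaves running there will always have their
         * EPC pages allocated from a remote node. */
        int nid;

        for_each_node_state(nid, N_MEMORY) {
                if (node_state(nid, N_CPU) && !node_isset(nid, sgx_numa_mask))
                        pr_info("node%d has both CPUs and memory but no EPC section\n", nid);
        }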

[PATCH v2 1/2] x86/sgx: Fix deadlock in SGX NUMA node search

2024-09-05 Thread Aaron Lu
"Molina Sabido, Gerardo" Tested-by: Zhimin Luo Reviewed-by: Kai Huang Acked-by: Dave Hansen Signed-off-by: Aaron Lu --- arch/x86/kernel/cpu/sgx/main.c | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x

[PATCH v2 0/2] SGX NUMA fix

2024-09-05 Thread Aaron Lu
ch2/2. Comments are welcome, thanks. v2: - Enhance changelog for patch1/2 according to Kai, Dave and Jarkko's suggestions; - Fix Reported-by tag, it should be Gerardo. Sorry for the mistake. - Collect review tags. - Add patch2/2. Aaron Lu (2): x86/sgx: Fix deadlock in SGX NUMA node search

Re: [PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()

2024-09-03 Thread Aaron Lu
On Tue, Sep 03, 2024 at 07:05:40PM +0300, Jarkko Sakkinen wrote: > On Fri Aug 30, 2024 at 9:14 AM EEST, Aaron Lu wrote: > > On Thu, Aug 29, 2024 at 07:44:13PM +0300, Jarkko Sakkinen wrote: > > > On Thu Aug 29, 2024 at 5:38 AM EEST, Aaron Lu wrote: > > > > When c

Re: [PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()

2024-09-02 Thread Aaron Lu
On Fri, Aug 30, 2024 at 07:03:33AM -0700, Dave Hansen wrote: > On 8/29/24 23:02, Aaron Lu wrote: > >> Also, I do think we should probably add some kind of sanity warning to > >> the SGX code in another patch. If a node on an SGX system has CPUs and > >> memory, it

Re: [PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()

2024-08-29 Thread Aaron Lu
On Thu, Aug 29, 2024 at 07:44:13PM +0300, Jarkko Sakkinen wrote: > On Thu Aug 29, 2024 at 5:38 AM EEST, Aaron Lu wrote: > > When current node doesn't have a EPC section configured by firmware and > > all other EPC sections memory are used up, CPU can stuck inside t

Re: [PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()

2024-08-29 Thread Aaron Lu
ct, thanks. > On 8/28/24 19:38, Aaron Lu wrote: > > When current node doesn't have a EPC section configured by firmware and > > all other EPC sections memory are used up, CPU can stuck inside the > > while loop in __sgx_alloc_epc_page() forever and soft lockup will hap
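The description is cut short here, but the loop in question walks sgx_numa_mask with next_node_in() and only exits once it wraps back to the current node; when the current node has no EPC section that exit condition can never fire. A hedged sketch of one way to bound the search, not necessarily the fix that was merged (__sgx_alloc_epc_page_from_node() stands in for the existing per-node allocation helper):

        /* Visit each node that actually has an EPC section at most once,
         * starting from the local node, then fail cleanly instead of
         * spinning until a remote free page appears. */
        struct sgx_epc_page *page;
        int nid = numa_node_id();
        int i, nr_nodes = nodes_weight(sgx_numa_mask);

        for (i = 0; i < nr_nodes; i++) {
                nid = next_node_in(nid, sgx_numa_mask);
                page = __sgx_alloc_epc_page_from_node(nid);
                if (page)
                        return page;
        }
        return ERR_PTR(-ENOMEM);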

Re: [PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()

2024-08-29 Thread Aaron Lu
On Thu, Aug 29, 2024 at 03:56:39PM +0800, Huang, Kai wrote: > Actually run spell check this time ... > > On Thu, 2024-08-29 at 10:38 +0800, Aaron Lu wrote: > > When current node doesn't have a EPC section configured by firmware and > > "current node" ->

[PATCH] x86/sgx: Fix deadloop in __sgx_alloc_epc_page()

2024-08-28 Thread Aaron Lu
gx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()") Reported-by: Zhimin Luo Tested-by: Zhimin Luo Signed-off-by: Aaron Lu --- This issue is found by Zhimin when doing internal testing and no external bug report has been sent out so there is no Closes: tag. arch/x86/k

Re: [PATCH v18 00/32] per memcg lru_lock

2020-09-08 Thread Aaron Lu
On Thu, Aug 27, 2020 at 09:40:22PM -0400, Daniel Jordan wrote: > I went back to your v1 post to see what motivated you originally, and you had > some results from aim9 but nothing about where this reared its head in the > first place. How did you discover the bottleneck? I'm just curious about ho

Re: [RFC PATCH 09/16] sched/fair: core wide cfs task priority comparison(Internet mail)

2020-07-24 Thread Aaron Lu
On Wed, Jul 22, 2020 at 12:23:44AM +, benbjiang(蒋彪) wrote: > > > > +/* > > + * This function takes care of adjusting the min_vruntime of siblings of > > + * a core during coresched enable/disable. > > + * This is called in stop machine context so no need to take the rq lock. > Hi, > > IMHO,

Re: [RFC PATCH 1/3 v2] futex: introduce FUTEX_SWAP operation

2020-06-23 Thread Aaron Lu
On Tue, Jun 16, 2020 at 10:22:26AM -0700, Peter Oskolkov wrote: > static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q > *q, > - struct hrtimer_sleeper *timeout) > + struct hrtimer_sleeper *timeout, > +

Re: [RFC PATCH 0/3 v2] futex: introduce FUTEX_SWAP operation

2020-06-22 Thread Aaron Lu
On Tue, Jun 16, 2020 at 10:22:11AM -0700, Peter Oskolkov wrote: > From 7b091e46de4f9227b5a943e6d78283564e8c1c72 Mon Sep 17 00:00:00 2001 > From: Peter Oskolkov > Date: Tue, 16 Jun 2020 10:13:58 -0700 > Subject: [RFC PATCH 0/3 v2] futex: introduce FUTEX_SWAP operation > > This is an RFC! > > As P

Re: [PATCH updated v2] sched/fair: core wide cfs task priority comparison

2020-05-22 Thread Aaron Lu
On Sat, May 16, 2020 at 11:42:30AM +0800, Aaron Lu wrote: > On Thu, May 14, 2020 at 03:02:48PM +0200, Peter Zijlstra wrote: > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -4476,6 +4473,16 @@ next_class:; > > WARN_ON_ONCE(!cookie_

Re: [RFC PATCH 07/13] sched: Add core wide task selection and scheduling.

2020-05-21 Thread Aaron Lu
On Thu, May 21, 2020 at 10:35:56PM -0400, Joel Fernandes wrote: > Discussed a lot with Vineeth. Below is an improved version of the pick_task() > simplification. > > It also handles the following "bug" in the existing code as well that Vineeth > brought up in OSPM: Suppose 2 siblings of a core: rq

Re: [PATCH updated v2] sched/fair: core wide cfs task priority comparison

2020-05-15 Thread Aaron Lu
On Thu, May 14, 2020 at 03:02:48PM +0200, Peter Zijlstra wrote: > On Fri, May 08, 2020 at 08:34:57PM +0800, Aaron Lu wrote: > > With this said, I realized a workaround for the issue described above: > > when the core went from 'compatible mode'(step 1-3) to 'incompa

Re: [PATCH updated v2] sched/fair: core wide cfs task priority comparison

2020-05-08 Thread Aaron Lu
On Fri, May 08, 2020 at 11:09:25AM +0200, Peter Zijlstra wrote: > On Fri, May 08, 2020 at 04:44:19PM +0800, Aaron Lu wrote: > > On Wed, May 06, 2020 at 04:35:06PM +0200, Peter Zijlstra wrote: > > > > Aside from this being way to complicated for what it does -- you >

Re: [PATCH updated v2] sched/fair: core wide cfs task priority comparison

2020-05-08 Thread Aaron Lu
On Wed, May 06, 2020 at 04:35:06PM +0200, Peter Zijlstra wrote: > > Sorry for being verbose; I've been procrastinating replying, and in > doing so the things I wanted to say kept growing. > > On Fri, Apr 24, 2020 at 10:24:43PM +0800, Aaron Lu wrote: > > > To make th

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-10-14 Thread Aaron Lu
On Sun, Oct 13, 2019 at 08:44:32AM -0400, Vineeth Remanan Pillai wrote: > On Fri, Oct 11, 2019 at 11:55 PM Aaron Lu wrote: > > > > > I don't think we need do the normalization afterwrads and it appears > > we are on the same page regarding core wide vruntime. Shou

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-10-11 Thread Aaron Lu
On Fri, Oct 11, 2019 at 08:10:30AM -0400, Vineeth Remanan Pillai wrote: > > Thanks for the clarification. > > > > Yes, this is the initialization issue I mentioned before when core > > scheduling is initially enabled. rq1's vruntime is bumped the first time > > update_core_cfs_rq_min_vruntime() is

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-10-11 Thread Aaron Lu
On Fri, Oct 11, 2019 at 07:32:48AM -0400, Vineeth Remanan Pillai wrote: > > > The reason we need to do this is because, new tasks that gets created will > > > have a vruntime based on the new min_vruntime and old tasks will have it > > > based on the old min_vruntime > > > > I think this is expecte

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-10-11 Thread Aaron Lu
On Thu, Oct 10, 2019 at 10:29:47AM -0400, Vineeth Remanan Pillai wrote: > > I didn't see why we need do this. > > > > We only need to have the root level sched entities' vruntime become core > > wide since we will compare vruntime for them across hyperthreads. For > > sched entities on sub cfs_rqs,

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-10-10 Thread Aaron Lu
On Wed, Oct 02, 2019 at 04:48:14PM -0400, Vineeth Remanan Pillai wrote: > On Mon, Sep 30, 2019 at 7:53 AM Vineeth Remanan Pillai > wrote: > > > > > > > Sorry, I misunderstood the fix and I did not initially see the core wide > > min_vruntime that you tried to maintain in the rq->core. This approac

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-15 Thread Aaron Lu
On Fri, Sep 13, 2019 at 07:12:52AM +0800, Aubrey Li wrote: > On Thu, Sep 12, 2019 at 8:04 PM Aaron Lu wrote: > > > > On Wed, Sep 11, 2019 at 09:19:02AM -0700, Tim Chen wrote: > > > On 9/11/19 7:02 AM, Aaron Lu wrote: > > > I think Julien's result show

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-13 Thread Aaron Lu
On Thu, Sep 12, 2019 at 10:29:13AM -0700, Tim Chen wrote: > On 9/12/19 5:35 AM, Aaron Lu wrote: > > On Wed, Sep 11, 2019 at 12:47:34PM -0400, Vineeth Remanan Pillai wrote: > > > > > core wide vruntime makes sense when there are multiple tasks of > > different cgroup

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-13 Thread Aaron Lu
On Thu, Sep 12, 2019 at 10:05:43AM -0700, Tim Chen wrote: > On 9/12/19 5:04 AM, Aaron Lu wrote: > > > Well, I have done following tests: > > 1 Julien's test script: https://paste.debian.net/plainh/834cf45c > > 2 start two tagged will-it-scale/page_fault1, see how

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-12 Thread Aaron Lu
On Wed, Sep 11, 2019 at 12:47:34PM -0400, Vineeth Remanan Pillai wrote: > > > So both of you are working on top of my 2 patches that deal with the > > > fairness issue, but I had the feeling Tim's alternative patches[1] are > > > simpler than mine and achieves the same result(after the force idle t

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-12 Thread Aaron Lu
On Wed, Sep 11, 2019 at 09:19:02AM -0700, Tim Chen wrote: > On 9/11/19 7:02 AM, Aaron Lu wrote: > > Hi Tim & Julien, > > > > On Fri, Sep 06, 2019 at 11:30:20AM -0700, Tim Chen wrote: > >> On 8/7/19 10:10 AM, Tim Chen wrote: > >>

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-11 Thread Aaron Lu
Hi Tim & Julien, On Fri, Sep 06, 2019 at 11:30:20AM -0700, Tim Chen wrote: > On 8/7/19 10:10 AM, Tim Chen wrote: > > > 3) Load balancing between CPU cores > > --- > > Say if one CPU core's sibling threads get forced idled > > a lot as it has mostly incompatible tas

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-15 Thread Aaron Lu
On Thu, Aug 15, 2019 at 06:09:28PM +0200, Dario Faggioli wrote: > On Wed, 2019-08-07 at 10:10 -0700, Tim Chen wrote: > > On 8/7/19 1:58 AM, Dario Faggioli wrote: > > > > > Since I see that, in this thread, there are various patches being > > > proposed and discussed... should I rerun my benchmarks

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-12 Thread Aaron Lu
On 2019/8/12 23:38, Vineeth Remanan Pillai wrote: >> I have two other small changes that I think are worth sending out. >> >> The first simplify logic in pick_task() and the 2nd avoid task pick all >> over again when max is preempted. I also refined the previous hack patch to >> make schedule alway

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-10 Thread Aaron Lu
On Thu, Aug 08, 2019 at 09:39:45AM -0700, Tim Chen wrote: > On 8/8/19 5:55 AM, Aaron Lu wrote: > > On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote: > >> On 8/2/19 8:37 AM, Julien Desfossez wrote: > >>> We tested both Aaron's and Tim's patches an

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-10 Thread Aaron Lu
On Thu, Aug 08, 2019 at 02:42:57PM -0700, Tim Chen wrote: > On 8/8/19 10:27 AM, Tim Chen wrote: > > On 8/7/19 11:47 PM, Aaron Lu wrote: > >> On Tue, Aug 06, 2019 at 02:19:57PM -0700, Tim Chen wrote: > >>> +void account_core_idletime(struct task_struct *p, u64 exec)

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-08 Thread Aaron Lu
On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote: > On 8/2/19 8:37 AM, Julien Desfossez wrote: > > We tested both Aaron's and Tim's patches and here are our results. > > > > Test setup: > > - 2 1-thread sysbench, one running the cpu benchmark, the other one the > > mem benchmark > > - bo

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-07 Thread Aaron Lu
On Tue, Aug 06, 2019 at 02:19:57PM -0700, Tim Chen wrote: > +void account_core_idletime(struct task_struct *p, u64 exec) > +{ > + const struct cpumask *smt_mask; > + struct rq *rq; > + bool force_idle, refill; > + int i, cpu; > + > + rq = task_rq(p); > + if (!sched_core_enab

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-06 Thread Aaron Lu
On 2019/8/6 22:17, Phil Auld wrote: > On Tue, Aug 06, 2019 at 09:54:01PM +0800 Aaron Lu wrote: >> On Mon, Aug 05, 2019 at 04:09:15PM -0400, Phil Auld wrote: >>> Hi, >>> >>> On Fri, Aug 02, 2019 at 11:37:15AM -0400 Julien Desfossez wrote: >>>> We te

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-06 Thread Aaron Lu
On Mon, Aug 05, 2019 at 04:09:15PM -0400, Phil Auld wrote: > Hi, > > On Fri, Aug 02, 2019 at 11:37:15AM -0400 Julien Desfossez wrote: > > We tested both Aaron's and Tim's patches and here are our results. > > > > Test setup: > > - 2 1-thread sysbench, one running the cpu benchmark, the other one

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-06 Thread Aaron Lu
On Tue, Aug 06, 2019 at 08:24:17AM -0400, Vineeth Remanan Pillai wrote: > > > > > > I also think a way to make fairness per cookie per core, is this what you > > > want to propose? > > > > Yes, that's what I meant. > > I think that would hurt some kind of workloads badly, especially if > one tenan

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-06 Thread Aaron Lu
On 2019/8/6 14:56, Aubrey Li wrote: > On Tue, Aug 6, 2019 at 11:24 AM Aaron Lu wrote: >> I've been thinking if we should consider core wide tenant fairness? >> >> Let's say there are 3 tasks on 2 threads' rq of the same core, 2 tasks >> (e.g. A1, A2) b

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-05 Thread Aaron Lu
On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote: > On 8/2/19 8:37 AM, Julien Desfossez wrote: > > We tested both Aaron's and Tim's patches and here are our results. > > > > Test setup: > > - 2 1-thread sysbench, one running the cpu benchmark, the other one the > > mem benchmark > > - bo

[PATCH 3/3] temp hack to make tick based schedule happen

2019-07-25 Thread Aaron Lu
and if so, do a schedule. Signed-off-by: Aaron Lu --- kernel/sched/fair.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 43babc2a12a5..730c9359e9c9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4093,6

[PATCH 2/3] core vruntime comparison

2019-07-25 Thread Aaron Lu
The two values can differ greatly and can cause tasks with a large vruntime to starve. So enable core scheduling early when the system is still kind of idle for the time being to avoid this problem. Signed-off-by: Aaron Lu --- kernel/sched/core.c | 15 ++---

[RFC PATCH 1/3] wrapper for cfs_rq->min_vruntime

2019-07-25 Thread Aaron Lu
Add a wrapper function cfs_rq_min_vruntime(cfs_rq) to return cfs_rq->min_vruntime. It will be used in the following patch, no functionality change. Signed-off-by: Aaron Lu --- kernel/sched/fair.c | 27 --- 1 file changed, 16 insertions(+), 11 deletions(-) diff --gi
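The wrapper itself is as small as the changelog suggests; roughly (a sketch, not the quoted patch):

        static inline u64 cfs_rq_min_vruntime(struct cfs_rq *cfs_rq)
        {
                return cfs_rq->min_vruntime;
        }

Funnelling every reader through one accessor presumably lets the follow-up "core vruntime comparison" patch return a core-wide value from this single place instead of touching each min_vruntime user.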

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-07-25 Thread Aaron Lu
On Mon, Jul 22, 2019 at 06:26:46PM +0800, Aubrey Li wrote: > The granularity period of util_avg seems too large to decide task priority > during pick_task(), at least it is in my case, cfs_prio_less() always picked > core max task, so pick_task() eventually picked idle, which causes this change > n

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-07-22 Thread Aaron Lu
On 2019/7/22 18:26, Aubrey Li wrote: > The granularity period of util_avg seems too large to decide task priority > during pick_task(), at least it is in my case, cfs_prio_less() always picked > core max task, so pick_task() eventually picked idle, which causes this change > not very helpful for my

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-07-18 Thread Aaron Lu
On Thu, Jul 18, 2019 at 04:27:19PM -0700, Tim Chen wrote: > > > On 7/18/19 3:07 AM, Aaron Lu wrote: > > On Wed, Jun 19, 2019 at 02:33:02PM -0400, Julien Desfossez wrote: > > > > > With the below patch on top of v3 that makes use of util_avg to decide > >

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-07-18 Thread Aaron Lu
On Wed, Jun 19, 2019 at 02:33:02PM -0400, Julien Desfossez wrote: > On 17-Jun-2019 10:51:27 AM, Aubrey Li wrote: > > The result looks still unfair, and particularly, the variance is too high, > > I just want to confirm that I am also seeing the same issue with a > similar setup. I also tried with

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-05-31 Thread Aaron Lu
On Fri, May 31, 2019 at 02:53:21PM +0800, Aubrey Li wrote: > On Fri, May 31, 2019 at 2:09 PM Aaron Lu wrote: > > > > On 2019/5/31 13:12, Aubrey Li wrote: > > > On Fri, May 31, 2019 at 11:01 AM Aaron Lu > > > wrote: > > >> > > >> This f

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-05-30 Thread Aaron Lu
On 2019/5/31 13:12, Aubrey Li wrote: > On Fri, May 31, 2019 at 11:01 AM Aaron Lu wrote: >> >> This feels like "date" failed to schedule on some CPU >> on time. >> >> My first reaction is: when shell wakes up from sleep, it will >> fork date.

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-05-30 Thread Aaron Lu
On 2019/5/30 22:04, Aubrey Li wrote: > On Thu, May 30, 2019 at 4:36 AM Vineeth Remanan Pillai > wrote: >> >> Third iteration of the Core-Scheduling feature. >> >> This version fixes mostly correctness related issues in v2 and >> addresses performance issues. Also, addressed some crashes related >>

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-05-08 Thread Aaron Lu
On Wed, May 08, 2019 at 01:49:09PM -0400, Julien Desfossez wrote: > On 08-May-2019 10:30:09 AM, Aaron Lu wrote: > > On Mon, May 06, 2019 at 03:39:37PM -0400, Julien Desfossez wrote: > > > On 29-Apr-2019 11:53:21 AM, Aaron Lu wrote: > > > > This is what I have used

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-05-07 Thread Aaron Lu
On Mon, May 06, 2019 at 03:39:37PM -0400, Julien Desfossez wrote: > On 29-Apr-2019 11:53:21 AM, Aaron Lu wrote: > > This is what I have used to make sure no two unmatched tasks being > > scheduled on the same core: (on top of v1, I thinks it's easier to just > > show the

Re: [RFC PATCH v2 13/17] sched: Add core wide task selection and scheduling.

2019-04-29 Thread Aaron Lu
On Tue, Apr 23, 2019 at 04:18:18PM +, Vineeth Remanan Pillai wrote: > +// XXX fairness/fwd progress conditions > +static struct task_struct * > +pick_task(struct rq *rq, const struct sched_class *class, struct task_struct > *max) > +{ > + struct task_struct *class_pick, *cookie_pick; > +

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-04-28 Thread Aaron Lu
On Tue, Apr 23, 2019 at 04:18:16PM +, Vineeth Remanan Pillai wrote: > +/* > + * Find left-most (aka, highest priority) task matching @cookie. > + */ > +struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie) > +{ > + struct rb_node *node = rq->core_tree.rb_node; > + str

Re: [RFC PATCH v2 09/17] sched: Introduce sched_class::pick_task()

2019-04-28 Thread Aaron Lu
On Tue, Apr 23, 2019 at 04:18:14PM +, Vineeth Remanan Pillai wrote: > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index c055bad249a9..45d86b862750 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -4132,7 +4132,7 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-04-28 Thread Aaron Lu
On Tue, Apr 23, 2019 at 06:45:27PM +, Vineeth Remanan Pillai wrote: > >> - Processes with different tags can still share the core > > > I may have missed something... Could you explain this statement? > > > This, to me, is the whole point of the patch series. If it's not > > doing this then .

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-04-28 Thread Aaron Lu
On Tue, Apr 23, 2019 at 04:18:16PM +, Vineeth Remanan Pillai wrote: > +/* > + * l(a,b) > + * le(a,b) := !l(b,a) > + * g(a,b) := l(b,a) > + * ge(a,b) := !l(a,b) > + */ > + > +/* real prio, less is less */ > +static inline bool __prio_less(struct task_struct *a, struct task_struct *b, > bool co

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-16 Thread Aaron Lu
On Tue, Apr 02, 2019 at 10:28:12AM +0200, Peter Zijlstra wrote: > On Tue, Apr 02, 2019 at 02:46:13PM +0800, Aaron Lu wrote: ... > > Perhaps we can test if max is on the same cpu as class_pick and then > > use cpu_prio_less() or core_prio_less() accordingly here, or just > > r

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-10 Thread Aaron Lu
On Wed, Apr 10, 2019 at 04:44:18PM +0200, Peter Zijlstra wrote: > On Wed, Apr 10, 2019 at 12:36:33PM +0800, Aaron Lu wrote: > > On Tue, Apr 09, 2019 at 11:09:45AM -0700, Tim Chen wrote: > > > Now that we have accumulated quite a number of different fixes to your > > > o

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-10 Thread Aaron Lu
On Wed, Apr 10, 2019 at 10:18:10PM +0800, Aubrey Li wrote: > On Wed, Apr 10, 2019 at 12:36 PM Aaron Lu wrote: > > > > On Tue, Apr 09, 2019 at 11:09:45AM -0700, Tim Chen wrote: > > > Now that we have accumulated quite a number of different fixes to your > > > org

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-09 Thread Aaron Lu
On Tue, Apr 09, 2019 at 11:09:45AM -0700, Tim Chen wrote: > Now that we have accumulated quite a number of different fixes to your orginal > posted patches. Would you like to post a v2 of the core scheduler with the > fixes? One more question I'm not sure: should a task with cookie=0, i.e. tasks

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-05 Thread Aaron Lu
On Tue, Apr 02, 2019 at 10:28:12AM +0200, Peter Zijlstra wrote: > Another approach would be something like the below: > > > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -87,7 +87,7 @@ static inline int __task_prio(struct tas > */ > > /* real prio, less is less */ > -static inli

Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.

2019-04-02 Thread Aaron Lu
On Tue, Apr 02, 2019 at 10:28:12AM +0200, Peter Zijlstra wrote: > Another approach would be something like the below: > > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -87,7 +87,7 @@ static inline int __task_prio(struct tas > */ > > /* real prio, less is less */ > -static inline

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-26 Thread Aaron Lu
On Tue, Mar 26, 2019 at 03:32:12PM +0800, Aaron Lu wrote: > On Fri, Mar 08, 2019 at 11:44:01AM -0800, Subhra Mazumdar wrote: > > > > On 2/22/19 4:45 AM, Mel Gorman wrote: > > >On Mon, Feb 18, 2019 at 09:49:10AM -0800, Linus Torvalds wrote: > > >>On Mon, Fe

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-26 Thread Aaron Lu
On Fri, Mar 08, 2019 at 11:44:01AM -0800, Subhra Mazumdar wrote: > > On 2/22/19 4:45 AM, Mel Gorman wrote: > >On Mon, Feb 18, 2019 at 09:49:10AM -0800, Linus Torvalds wrote: > >>On Mon, Feb 18, 2019 at 9:40 AM Peter Zijlstra wrote: > >>>However; whichever way around you turn this cookie; it is ex

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-12 Thread Aaron Lu
On Mon, Mar 11, 2019 at 05:20:19PM -0700, Greg Kerr wrote: > On Mon, Mar 11, 2019 at 4:36 PM Subhra Mazumdar > wrote: > > > > > > On 3/11/19 11:34 AM, Subhra Mazumdar wrote: > > > > > > On 3/10/19 9:23 PM, Aubrey Li wrote: > > >> On Sat, Mar 9, 2019 at 3:50 AM Subhra Mazumdar > > >> wrote: > > >>

Re: [RFC PATCH 4/4] mm: Add merge page notifier

2019-02-11 Thread Aaron Lu
On 2019/2/11 23:58, Alexander Duyck wrote: > On Mon, 2019-02-11 at 14:40 +0800, Aaron Lu wrote: >> On 2019/2/5 2:15, Alexander Duyck wrote: >>> From: Alexander Duyck >>> >>> Because the implementation was limiting itself to only providing hints on >>

Re: [RFC PATCH 4/4] mm: Add merge page notifier

2019-02-10 Thread Aaron Lu
On 2019/2/5 2:15, Alexander Duyck wrote: > From: Alexander Duyck > > Because the implementation was limiting itself to only providing hints on > pages huge TLB order sized or larger we introduced the possibility for free > pages to slip past us because they are freed as something less than > huge

[PATCH v2 RESEND 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()

2018-11-19 Thread Aaron Lu
page_frag_free() calls __free_pages_ok() to free the page back to Buddy. This is OK for high order pages, but for order-0 pages, it misses the optimization opportunity of using Per-Cpu-Pages and can cause zone lock contention when called frequently. Paweł Staszewski recently shared his result of 'h
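A sketch of the change being described, routing only the order-0 case through the per-CPU lists (helper names follow the page allocator of that era; treat the exact shape as illustrative rather than the posted diff):

        void page_frag_free(void *addr)
        {
                struct page *page = virt_to_head_page(addr);

                if (unlikely(put_page_testzero(page))) {
                        unsigned int order = compound_order(page);

                        if (order == 0)         /* order-0: per-cpu list, no zone->lock */
                                free_unref_page(page);
                        else
                                __free_pages_ok(page, order);
                }
        }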

[PATCH v3 RESEND 2/2] mm/page_alloc: use a single function to free page

2018-11-19 Thread Aaron Lu
There are multiple places that free a page; they all do the same things, so a common function can be used to reduce code duplication. It also avoids a bug being fixed in one function but left in another. Acked-by: Vlastimil Babka Signed-off-by: Aaron Lu --- mm/page_alloc.c | 37
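The consolidation amounts to one helper that both __free_pages() and page_frag_free() can call, folding in the order-0/PCP special case from patch 1/2; a sketch (the helper name is assumed):

        static inline void free_the_page(struct page *page, unsigned int order)
        {
                if (order == 0)         /* via per-cpu pages */
                        free_unref_page(page);
                else
                        __free_pages_ok(page, order);
        }

With every free path going through it, a fix applied to one caller can no longer be missed in another.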

[PATCH RESEND 0/2] free order-0 pages through PCP in page_frag_free() and cleanup

2018-11-19 Thread Aaron Lu
single function to free page https://lkml.kernel.org/r/20181106113149.gc24...@intel.com With some changelog rewording. Applies on top of v4.20-rc2-mmotm-2018-11-16-14-52. Aaron Lu (2): mm/page_alloc: free order-0 pages through PCP in page_frag_free() mm/page_alloc: use a single function to

[PATCH] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2018-11-15 Thread Aaron Lu
ruct is still needed in case nr_node_ids is really big on some systems. Cc: Vasily Averin Cc: Michal Hocko Cc: Huang Ying Signed-off-by: Aaron Lu --- include/linux/swap.h | 11 ++- mm/swapfile.c| 3 ++- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/include/li
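The snippet is cut off, but the subject line describes sizing the per-node plist array in swap_info_struct by nr_node_ids instead of MAX_NUMNODES. A sketch of the shape of such a change (field placement and the allocation site in alloc_swap_info() are assumptions):

         struct swap_info_struct {
                 /* ... other members unchanged ... */
        -        struct plist_node avail_lists[MAX_NUMNODES];
        +        struct plist_node avail_lists[];        /* flexible array, must stay last */
         };

        -        p = kvzalloc(sizeof(*p), GFP_KERNEL);
        +        p = kvzalloc(struct_size(p, avail_lists, nr_node_ids), GFP_KERNEL);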

Re: [LKP] [bpf] fd978bf7fd: will-it-scale.per_process_ops -4.0% regression

2018-11-08 Thread Aaron Lu
On Fri, Nov 09, 2018 at 08:19:54AM +0800, Rong Chen wrote: > > > On 11/02/2018 04:36 PM, Daniel Borkmann wrote: > > Hi Rong, > > > > On 11/02/2018 03:14 AM, kernel test robot wrote: > > > Greeting, > > > > > > FYI, we noticed a -4.0% regression of will-it-scale.per_process_ops due > > > to com

Re: [PATCH v2 2/2] mm/page_alloc: use a single function to free page

2018-11-06 Thread Aaron Lu
On Tue, Nov 06, 2018 at 10:32:00AM +0100, Vlastimil Babka wrote: > On 11/6/18 9:47 AM, Aaron Lu wrote: > > On Tue, Nov 06, 2018 at 09:16:20AM +0100, Vlastimil Babka wrote: > >> On 11/6/18 6:30 AM, Aaron Lu wrote: > >>> We have multiple places of freeing a pa

Re: [PATCH v2] mm: use kvzalloc for swap_info_struct allocation

2018-11-05 Thread Aaron Lu
to add > Fixes: a2468cc9bfdf ("swap: choose swap device according to numa node") > > because not being able to add a swap space on a fragmented system looks > like a regression to me. Agree, especially it used to work. Regards, Aaron > > Acked-by: Aaron Lu >

Re: [PATCH 1/2] mm: use kvzalloc for swap_info_struct allocation

2018-11-04 Thread Aaron Lu
I didn't realize this problem when developing this patch, thanks for pointing this out. I think using kvzalloc() as is done by your patch is better here as it can avoid possible failure of swapon. Acked-by: Aaron Lu BTW, for systems with few swap devices this may not be a big deal, but accordin
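For context, the change being acked swaps the physically contiguous allocation for one that can fall back to vmalloc, so swapon no longer fails merely because memory is fragmented; in outline (a sketch of the pattern, not the exact diff):

        -        p = kzalloc(sizeof(*p), GFP_KERNEL);
        +        p = kvzalloc(sizeof(*p), GFP_KERNEL);
                 ...
        -        kfree(p);
        +        kvfree(p);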

Re: [LKP] [lkp-robot] [sched/fair] d519329f72: unixbench.score -9.9% regression

2018-10-25 Thread Aaron Lu
On Wed, Oct 24, 2018 at 06:01:37PM +0100, Patrick Bellasi wrote: > On 24-Oct 14:41, Aaron Lu wrote: > > On Mon, Apr 02, 2018 at 11:20:00AM +0800, Ye, Xiaolong wrote: > > > > > > Greeting, > > > > > > FYI, we noticed a -9.9% regression of unixbench.sc

Re: [LKP] [lkp-robot] [sched/fair] d519329f72: unixbench.score -9.9% regression

2018-10-23 Thread Aaron Lu
On Mon, Apr 02, 2018 at 11:20:00AM +0800, Ye, Xiaolong wrote: > > Greeting, > > FYI, we noticed a -9.9% regression of unixbench.score due to commit: > > > commit: d519329f72a6f36bc4f2b85452640cfe583b4f81 ("sched/fair: Update > util_est only on util_avg updates") > https://git.kernel.org/cgit/l

Re: [RFC v4 PATCH 3/5] mm/rmqueue_bulk: alloc without touching individual page structure

2018-10-22 Thread Aaron Lu
On Mon, Oct 22, 2018 at 11:37:53AM +0200, Vlastimil Babka wrote: > On 10/17/18 8:33 AM, Aaron Lu wrote: > > Profile on Intel Skylake server shows the most time consuming part > > under zone->lock on allocation path is accessing those to-be-returned > > page's "st

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-20 Thread Aaron Lu
On Fri, Oct 19, 2018 at 08:00:53AM -0700, Daniel Jordan wrote: > On Fri, Oct 19, 2018 at 09:54:35AM +0100, Mel Gorman wrote: > > On Fri, Oct 19, 2018 at 01:57:03PM +0800, Aaron Lu wrote: > > > > > > > > I don't think this is the right way of thinking about it

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-18 Thread Aaron Lu
On Thu, Oct 18, 2018 at 12:16:32PM +0100, Mel Gorman wrote: > On Wed, Oct 17, 2018 at 10:59:04PM +0800, Aaron Lu wrote: > > > Any particuular reason why? I assume it's related to the number of zone > > > locks with the increase number of zones and the number of threa

Re: [RFC v4 PATCH 3/5] mm/rmqueue_bulk: alloc without touching individual page structure

2018-10-18 Thread Aaron Lu
On Thu, Oct 18, 2018 at 12:20:55PM +0100, Mel Gorman wrote: > On Wed, Oct 17, 2018 at 10:23:27PM +0800, Aaron Lu wrote: > > > RT has had problems with cpu_relax in the past but more importantly, as > > > this delay for parallel compactions and allocations of contig ranges, &g

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-18 Thread Aaron Lu
On Thu, Oct 18, 2018 at 10:23:22AM +0200, Vlastimil Babka wrote: > On 10/18/18 8:48 AM, Aaron Lu wrote: > > On Wed, Oct 17, 2018 at 07:03:30PM +0200, Vlastimil Babka wrote: > >> On 10/17/18 3:58 PM, Mel Gorman wrote: > >>> Again, as compaction is not guaranteed to

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-17 Thread Aaron Lu
On Wed, Oct 17, 2018 at 07:03:30PM +0200, Vlastimil Babka wrote: > On 10/17/18 3:58 PM, Mel Gorman wrote: > > Again, as compaction is not guaranteed to find the pageblocks, it would > > be important to consider whether a) that matters or b) find an > > alternative way of keeping unmerged buddies on

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-17 Thread Aaron Lu
On Wed, Oct 17, 2018 at 02:58:07PM +0100, Mel Gorman wrote: > On Wed, Oct 17, 2018 at 09:10:59PM +0800, Aaron Lu wrote: > > On Wed, Oct 17, 2018 at 11:44:27AM +0100, Mel Gorman wrote: > > > On Wed, Oct 17, 2018 at 02:33:27PM +0800, Aaron Lu wrote: > > > > Running wil

Re: [RFC v4 PATCH 3/5] mm/rmqueue_bulk: alloc without touching individual page structure

2018-10-17 Thread Aaron Lu
On Wed, Oct 17, 2018 at 12:20:42PM +0100, Mel Gorman wrote: > On Wed, Oct 17, 2018 at 02:33:28PM +0800, Aaron Lu wrote: > > Profile on Intel Skylake server shows the most time consuming part > > under zone->lock on allocation path is accessing those to-be-returned > > pag

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-17 Thread Aaron Lu
On Wed, Oct 17, 2018 at 11:44:27AM +0100, Mel Gorman wrote: > On Wed, Oct 17, 2018 at 02:33:27PM +0800, Aaron Lu wrote: > > Running will-it-scale/page_fault1 process mode workload on a 2 sockets > > Intel Skylake server showed severe lock contention of zone->lock, as > >

[RFC v4 PATCH 5/5] mm/can_skip_merge(): make it more aggressive to attempt cluster alloc/free

2018-10-16 Thread Aaron Lu
ree when problem would occur, e.g. when compaction is in progress. Signed-off-by: Aaron Lu --- mm/internal.h | 4 1 file changed, 4 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index fb4e8f7976e5..309a3f43e613 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -538,10 +538,6

[RFC v4 PATCH 1/5] mm/page_alloc: use helper functions to add/remove a page to/from buddy

2018-10-16 Thread Aaron Lu
There are multiple places that add/remove a page to/from buddy; introduce helper functions for them. This also makes it easier to add code when a page is added to/removed from buddy. No functionality change. Acked-by: Vlastimil Babka Signed-off-by: Aaron Lu --- mm/page_alloc.c | 65
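A sketch of what such helpers usually look like (names and exact bookkeeping are inferred from the changelog, not copied from the posted patch, which may for example split the add side into head/tail variants):

        static inline void add_to_buddy(struct page *page, struct zone *zone,
                                        unsigned int order, int mt)
        {
                set_page_order(page, order);
                list_add(&page->lru, &zone->free_area[order].free_list[mt]);
                zone->free_area[order].nr_free++;
        }

        static inline void remove_from_buddy(struct page *page, struct zone *zone,
                                             unsigned int order)
        {
                list_del(&page->lru);
                zone->free_area[order].nr_free--;
                rmv_page_order(page);
        }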

[RFC v4 PATCH 4/5] mm/free_pcppages_bulk: reduce overhead of cluster operation on free path

2018-10-16 Thread Aaron Lu
es_bulk(), we can avoid calling add_to_cluster() one time per page but adding them in one go as a single cluster so this patch just did this. This optimization brings zone->lock contention down from 25% to almost zero again using the parallel free workload. Signed-off-by: Aaron Lu --- mm/p

[RFC v4 PATCH 3/5] mm/rmqueue_bulk: alloc without touching individual page structure

2018-10-16 Thread Aaron Lu
could eliminate zone->lock contention entirely but at the same time, pgdat->lru_lock contention rose to 82%. Final performance increased about 8.3%. Suggested-by: Ying Huang Suggested-by: Dave Hansen Signed-off-by: Aaron Lu --- include/linux/mm_types.h | 19 +-- include/linux/mmzon

[RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-16 Thread Aaron Lu
. Though performance dropped a little, it almost eliminated zone lock contention on free path and it is the foundation for the next patch that eliminates zone lock contention for allocation path. Suggested-by: Dave Hansen Signed-off-by: Aaron Lu --- include/linux/mm_types.h | 9 +++- mm/compa

[RFC v4 PATCH 0/5] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free

2018-10-16 Thread Aaron Lu
//lkml.kernel.org/r/1489568404-7817-1-git-send-email-aaron...@intel.com A branch is maintained here in case someone wants to give it a try: https://github.com/aaronlu/linux no_merge_cluster_alloc_4.19-rc5 v4: - rebased to v4.19-rc5; - add numbers from netperf(courtesy of Tariq Toukan) Aaron Lu (

Re: [RFC PATCH 0/9] Improve zone lock scalability using Daniel Jordan's list work

2018-09-24 Thread Aaron Lu
On Fri, Sep 21, 2018 at 10:45:36AM -0700, Daniel Jordan wrote: > On Tue, Sep 11, 2018 at 01:36:07PM +0800, Aaron Lu wrote: > > Daniel Jordan and others proposed an innovative technique to make > > multiple threads concurrently use list_del() at any position of the > > list a

[RFC PATCH 8/9] mm: use smp_list_splice() on free path

2018-09-10 Thread Aaron Lu
page and we lose the merge opportunity for them. With this patch, we will have mergeable pages unmerged in Buddy. Due to this, I don't see much value of keeping the range lock which is used to avoid such a thing from happening, so the range lock is removed in this patch. Signed-off-by: Aar

[RFC PATCH 1/9] mm: do not add anon pages to LRU

2018-09-10 Thread Aaron Lu
For the sake of testing purpose, do not add anon pages to LRU to avoid LRU lock so we can test zone lock exclusively. Signed-off-by: Aaron Lu --- mm/memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memory.c b/mm/memory.c index c467102a5cbc..080641255b8b 100644
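The diff itself is truncated by the archive; the stated intent (keep freshly faulted anon pages off the LRU so lru_lock never enters the picture while zone->lock is being measured) would be a one-line change in the anonymous fault path, roughly as below, with the exact call site an assumption:

                 page_add_new_anon_rmap(page, vma, vmf->address, false);
                 mem_cgroup_commit_charge(page, memcg, false, false);
        -        lru_cache_add_active_or_unevictable(page, vma);
        +        /* test hack: skip the LRU so lru_lock is never taken */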

[RFC PATCH 2/9] mm: introduce smp_list_del for concurrent list entry removals

2018-09-10 Thread Aaron Lu
From: Daniel Jordan Now that the LRU lock is a RW lock, lay the groundwork for fine-grained synchronization so that multiple threads holding the lock as reader can safely remove pages from an LRU at the same time. Add a thread-safe variant of list_del called smp_list_del that allows multiple thr

[RFC PATCH 5/9] mm/page_alloc: use helper functions to add/remove a page to/from buddy

2018-09-10 Thread Aaron Lu
There are multiple places that add/remove a page to/from buddy; introduce helper functions for them. This also makes it easier to add code when a page is added to/removed from buddy. No functionality change. Acked-by: Vlastimil Babka Signed-off-by: Aaron Lu --- mm/page_alloc.c | 65

[RFC PATCH 0/9] Improve zone lock scalability using Daniel Jordan's list work

2018-09-10 Thread Aaron Lu
inate zone lock contention entirely, but has worse fragmentation issue. [0] https://lwn.net/Articles/753058/ [1] https://lkml.kernel.org/r/20180911004240.4758-1-daniel.m.jor...@oracle.com [2] https://lkml.kernel.org/r/20180509085450.3524-1-aaron...@intel.com Aaron Lu (7): mm: do not add anon pages to LR

[RFC PATCH 3/9] mm: introduce smp_list_splice to prepare for concurrent LRU adds

2018-09-10 Thread Aaron Lu
From: Daniel Jordan Now that we splice a local list onto the LRU, prepare for multiple tasks doing this concurrently by adding a variant of the kernel's list splicing API, list_splice, that's designed to work with multiple tasks. Although there is naturally less parallelism to be gained from loc
