node with both CPUs and memory
lacks an EPC section. This will provide users with a hint as to why they
might be experiencing less-than-ideal performance when running SGX
enclaves.
Suggested-by: Dave Hansen
Signed-off-by: Aaron Lu
---
arch/x86/kernel/cpu/sgx/main.c | 7 +++
1 file changed, 7
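The sanity check this changelog describes can be sketched in userspace C. This is only an illustration under stated assumptions: struct node_info, its fields and count_nodes_missing_epc() are hypothetical names, not the kernel's data structures, and the actual change to arch/x86/kernel/cpu/sgx/main.c is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-node summary; the kernel tracks this differently. */
struct node_info {
	bool has_cpus;
	bool has_memory;
	bool has_epc;	/* true if firmware configured an EPC section here */
};

/*
 * Print a hint for every node that has both CPUs and memory but no EPC
 * section, and return how many such nodes were found.
 */
static size_t count_nodes_missing_epc(const struct node_info *nodes, size_t nr)
{
	size_t i, missing = 0;

	assert(nodes != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (nodes[i].has_cpus && nodes[i].has_memory &&
		    !nodes[i].has_epc) {
			fprintf(stderr,
				"node %zu has CPUs and memory but no EPC section\n",
				i);
			missing++;
		}
	}
	return missing;
}
```

A node with memory but no CPUs is deliberately not reported in this sketch, since the changelog only mentions nodes with both.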
"Molina Sabido, Gerardo"
Tested-by: Zhimin Luo
Reviewed-by: Kai Huang
Acked-by: Dave Hansen
Signed-off-by: Aaron Lu
---
arch/x86/kernel/cpu/sgx/main.c | 27 ++-
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x
ch2/2.
Comments are welcome, thanks.
v2:
- Enhance changelog for patch1/2 according to Kai, Dave and Jarkko's
suggestions;
- Fix Reported-by tag, it should be Gerardo. Sorry for the mistake.
- Collect review tags.
- Add patch2/2.
Aaron Lu (2):
x86/sgx: Fix deadlock in SGX NUMA node search
On Tue, Sep 03, 2024 at 07:05:40PM +0300, Jarkko Sakkinen wrote:
> On Fri Aug 30, 2024 at 9:14 AM EEST, Aaron Lu wrote:
> > On Thu, Aug 29, 2024 at 07:44:13PM +0300, Jarkko Sakkinen wrote:
> > > On Thu Aug 29, 2024 at 5:38 AM EEST, Aaron Lu wrote:
> > > > When c
On Fri, Aug 30, 2024 at 07:03:33AM -0700, Dave Hansen wrote:
> On 8/29/24 23:02, Aaron Lu wrote:
> >> Also, I do think we should probably add some kind of sanity warning to
> >> the SGX code in another patch. If a node on an SGX system has CPUs and
> >> memory, it
On Thu, Aug 29, 2024 at 07:44:13PM +0300, Jarkko Sakkinen wrote:
> On Thu Aug 29, 2024 at 5:38 AM EEST, Aaron Lu wrote:
> > When current node doesn't have a EPC section configured by firmware and
> > all other EPC sections memory are used up, CPU can stuck inside t
ct, thanks.
> On 8/28/24 19:38, Aaron Lu wrote:
> > When current node doesn't have a EPC section configured by firmware and
> > all other EPC sections memory are used up, CPU can stuck inside the
> > while loop in __sgx_alloc_epc_page() forever and soft lockup will hap
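One way a node search of this kind can be made to terminate is to remember the first node actually taken from the EPC mask and stop when the iteration comes back to it, instead of waiting to see the current node again. The following is a userspace sketch under stated assumptions, not the actual patch: next_node_in_mask(), alloc_epc_page_node(), MAX_NODES and the free_pages array are all hypothetical stand-ins.

```c
#include <assert.h>

#define MAX_NODES 8

/*
 * Hypothetical stand-in for a wrapping node iterator: return the next node
 * after nid that is set in mask, wrapping around, or MAX_NODES if the mask
 * is empty.
 */
static int next_node_in_mask(int nid, unsigned int mask)
{
	int i;

	for (i = 1; i <= MAX_NODES; i++) {
		int cand = (nid + i) % MAX_NODES;

		if (mask & (1u << cand))
			return cand;
	}
	return MAX_NODES;
}

/*
 * Visit every node in epc_mask at most once, starting with the current node
 * if it has an EPC section. free_pages[n] is the number of free EPC pages on
 * node n. Returns the node a page was taken from, or -1 if none is left.
 */
static int alloc_epc_page_node(int current_nid, unsigned int epc_mask,
			       int *free_pages)
{
	int nid_start, nid;

	assert(current_nid >= 0 && current_nid < MAX_NODES);

	if (epc_mask & (1u << current_nid))
		nid_start = current_nid;
	else
		nid_start = next_node_in_mask(current_nid, epc_mask);
	if (nid_start == MAX_NODES)
		return -1;	/* no node has an EPC section */

	nid = nid_start;
	do {
		if (free_pages[nid] > 0) {
			free_pages[nid]--;
			return nid;
		}
		nid = next_node_in_mask(nid, epc_mask);
	} while (nid != nid_start);	/* nid_start is always in the mask */

	return -1;
}
```

Because nid_start is guaranteed to be a member of the mask, the iterator is certain to return it again, so the loop ends even when the current node has no EPC section and every other section is exhausted.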
On Thu, Aug 29, 2024 at 03:56:39PM +0800, Huang, Kai wrote:
> Actually run spell check this time ...
>
> On Thu, 2024-08-29 at 10:38 +0800, Aaron Lu wrote:
> > When current node doesn't have a EPC section configured by firmware and
>
> "current node" ->
gx: Add a basic NUMA allocation scheme to
sgx_alloc_epc_page()")
Reported-by: Zhimin Luo
Tested-by: Zhimin Luo
Signed-off-by: Aaron Lu
---
This issue was found by Zhimin during internal testing and no
external bug report has been sent out, so there is no Closes: tag.
arch/x86/k
On Thu, Aug 27, 2020 at 09:40:22PM -0400, Daniel Jordan wrote:
> I went back to your v1 post to see what motivated you originally, and you had
> some results from aim9 but nothing about where this reared its head in the
> first place. How did you discover the bottleneck? I'm just curious about ho
On Wed, Jul 22, 2020 at 12:23:44AM +, benbjiang(蒋彪) wrote:
>
>
> > +/*
> > + * This function takes care of adjusting the min_vruntime of siblings of
> > + * a core during coresched enable/disable.
> > + * This is called in stop machine context so no need to take the rq lock.
> Hi,
>
> IMHO,
On Tue, Jun 16, 2020 at 10:22:26AM -0700, Peter Oskolkov wrote:
> static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q
> *q,
> - struct hrtimer_sleeper *timeout)
> + struct hrtimer_sleeper *timeout,
> +
On Tue, Jun 16, 2020 at 10:22:11AM -0700, Peter Oskolkov wrote:
> From 7b091e46de4f9227b5a943e6d78283564e8c1c72 Mon Sep 17 00:00:00 2001
> From: Peter Oskolkov
> Date: Tue, 16 Jun 2020 10:13:58 -0700
> Subject: [RFC PATCH 0/3 v2] futex: introduce FUTEX_SWAP operation
>
> This is an RFC!
>
> As P
On Sat, May 16, 2020 at 11:42:30AM +0800, Aaron Lu wrote:
> On Thu, May 14, 2020 at 03:02:48PM +0200, Peter Zijlstra wrote:
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4476,6 +4473,16 @@ next_class:;
> > WARN_ON_ONCE(!cookie_
On Thu, May 21, 2020 at 10:35:56PM -0400, Joel Fernandes wrote:
> Discussed a lot with Vineeth. Below is an improved version of the pick_task()
> simplification.
>
> It also handles the following "bug" in the existing code as well that Vineeth
> brought up in OSPM: Suppose 2 siblings of a core: rq
On Thu, May 14, 2020 at 03:02:48PM +0200, Peter Zijlstra wrote:
> On Fri, May 08, 2020 at 08:34:57PM +0800, Aaron Lu wrote:
> > With this said, I realized a workaround for the issue described above:
> > when the core went from 'compatible mode'(step 1-3) to 'incompa
On Fri, May 08, 2020 at 11:09:25AM +0200, Peter Zijlstra wrote:
> On Fri, May 08, 2020 at 04:44:19PM +0800, Aaron Lu wrote:
> > On Wed, May 06, 2020 at 04:35:06PM +0200, Peter Zijlstra wrote:
>
> > > Aside from this being way too complicated for what it does -- you
>
On Wed, May 06, 2020 at 04:35:06PM +0200, Peter Zijlstra wrote:
>
> Sorry for being verbose; I've been procrastinating replying, and in
> doing so the things I wanted to say kept growing.
>
> On Fri, Apr 24, 2020 at 10:24:43PM +0800, Aaron Lu wrote:
>
> > To make th
On Sun, Oct 13, 2019 at 08:44:32AM -0400, Vineeth Remanan Pillai wrote:
> On Fri, Oct 11, 2019 at 11:55 PM Aaron Lu wrote:
>
> >
> > I don't think we need to do the normalization afterwards and it appears
> > we are on the same page regarding core wide vruntime.
Shou
On Fri, Oct 11, 2019 at 08:10:30AM -0400, Vineeth Remanan Pillai wrote:
> > Thanks for the clarification.
> >
> > Yes, this is the initialization issue I mentioned before when core
> > scheduling is initially enabled. rq1's vruntime is bumped the first time
> > update_core_cfs_rq_min_vruntime() is
On Fri, Oct 11, 2019 at 07:32:48AM -0400, Vineeth Remanan Pillai wrote:
> > > The reason we need to do this is because, new tasks that gets created will
> > > have a vruntime based on the new min_vruntime and old tasks will have it
> > > based on the old min_vruntime
> >
> > I think this is expecte
On Thu, Oct 10, 2019 at 10:29:47AM -0400, Vineeth Remanan Pillai wrote:
> > I didn't see why we need do this.
> >
> > We only need to have the root level sched entities' vruntime become core
> > wide since we will compare vruntime for them across hyperthreads. For
> > sched entities on sub cfs_rqs,
On Wed, Oct 02, 2019 at 04:48:14PM -0400, Vineeth Remanan Pillai wrote:
> On Mon, Sep 30, 2019 at 7:53 AM Vineeth Remanan Pillai
> wrote:
> >
> > >
> > Sorry, I misunderstood the fix and I did not initially see the core wide
> > min_vruntime that you tried to maintain in the rq->core. This approac
On Fri, Sep 13, 2019 at 07:12:52AM +0800, Aubrey Li wrote:
> On Thu, Sep 12, 2019 at 8:04 PM Aaron Lu wrote:
> >
> > On Wed, Sep 11, 2019 at 09:19:02AM -0700, Tim Chen wrote:
> > > On 9/11/19 7:02 AM, Aaron Lu wrote:
> > > I think Julien's result show
On Thu, Sep 12, 2019 at 10:29:13AM -0700, Tim Chen wrote:
> On 9/12/19 5:35 AM, Aaron Lu wrote:
> > On Wed, Sep 11, 2019 at 12:47:34PM -0400, Vineeth Remanan Pillai wrote:
>
> >
> > core wide vruntime makes sense when there are multiple tasks of
> > different cgroup
On Thu, Sep 12, 2019 at 10:05:43AM -0700, Tim Chen wrote:
> On 9/12/19 5:04 AM, Aaron Lu wrote:
>
> > Well, I have done following tests:
> > 1 Julien's test script: https://paste.debian.net/plainh/834cf45c
> > 2 start two tagged will-it-scale/page_fault1, see how
On Wed, Sep 11, 2019 at 12:47:34PM -0400, Vineeth Remanan Pillai wrote:
> > > So both of you are working on top of my 2 patches that deal with the
> > > fairness issue, but I had the feeling Tim's alternative patches[1] are
> > > simpler than mine and achieves the same result(after the force idle t
On Wed, Sep 11, 2019 at 09:19:02AM -0700, Tim Chen wrote:
> On 9/11/19 7:02 AM, Aaron Lu wrote:
> > Hi Tim & Julien,
> >
> > On Fri, Sep 06, 2019 at 11:30:20AM -0700, Tim Chen wrote:
> >> On 8/7/19 10:10 AM, Tim Chen wrote:
> >>
Hi Tim & Julien,
On Fri, Sep 06, 2019 at 11:30:20AM -0700, Tim Chen wrote:
> On 8/7/19 10:10 AM, Tim Chen wrote:
>
> > 3) Load balancing between CPU cores
> > ---
> > Say if one CPU core's sibling threads get forced idled
> > a lot as it has mostly incompatible tas
On Thu, Aug 15, 2019 at 06:09:28PM +0200, Dario Faggioli wrote:
> On Wed, 2019-08-07 at 10:10 -0700, Tim Chen wrote:
> > On 8/7/19 1:58 AM, Dario Faggioli wrote:
> >
> > > Since I see that, in this thread, there are various patches being
> > > proposed and discussed... should I rerun my benchmarks
On 2019/8/12 23:38, Vineeth Remanan Pillai wrote:
>> I have two other small changes that I think are worth sending out.
>>
>> The first simplify logic in pick_task() and the 2nd avoid task pick all
>> over again when max is preempted. I also refined the previous hack patch to
>> make schedule alway
On Thu, Aug 08, 2019 at 09:39:45AM -0700, Tim Chen wrote:
> On 8/8/19 5:55 AM, Aaron Lu wrote:
> > On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote:
> >> On 8/2/19 8:37 AM, Julien Desfossez wrote:
> >>> We tested both Aaron's and Tim's patches an
On Thu, Aug 08, 2019 at 02:42:57PM -0700, Tim Chen wrote:
> On 8/8/19 10:27 AM, Tim Chen wrote:
> > On 8/7/19 11:47 PM, Aaron Lu wrote:
> >> On Tue, Aug 06, 2019 at 02:19:57PM -0700, Tim Chen wrote:
> >>> +void account_core_idletime(struct task_struct *p, u64 exec)
On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote:
> On 8/2/19 8:37 AM, Julien Desfossez wrote:
> > We tested both Aaron's and Tim's patches and here are our results.
> >
> > Test setup:
> > - 2 1-thread sysbench, one running the cpu benchmark, the other one the
> > mem benchmark
> > - bo
On Tue, Aug 06, 2019 at 02:19:57PM -0700, Tim Chen wrote:
> +void account_core_idletime(struct task_struct *p, u64 exec)
> +{
> + const struct cpumask *smt_mask;
> + struct rq *rq;
> + bool force_idle, refill;
> + int i, cpu;
> +
> + rq = task_rq(p);
> + if (!sched_core_enab
On 2019/8/6 22:17, Phil Auld wrote:
> On Tue, Aug 06, 2019 at 09:54:01PM +0800 Aaron Lu wrote:
>> On Mon, Aug 05, 2019 at 04:09:15PM -0400, Phil Auld wrote:
>>> Hi,
>>>
>>> On Fri, Aug 02, 2019 at 11:37:15AM -0400 Julien Desfossez wrote:
>>>> We te
On Mon, Aug 05, 2019 at 04:09:15PM -0400, Phil Auld wrote:
> Hi,
>
> On Fri, Aug 02, 2019 at 11:37:15AM -0400 Julien Desfossez wrote:
> > We tested both Aaron's and Tim's patches and here are our results.
> >
> > Test setup:
> > - 2 1-thread sysbench, one running the cpu benchmark, the other one
On Tue, Aug 06, 2019 at 08:24:17AM -0400, Vineeth Remanan Pillai wrote:
> > >
> > > I also think a way to make fairness per cookie per core, is this what you
> > > want to propose?
> >
> > Yes, that's what I meant.
>
> I think that would hurt some kind of workloads badly, especially if
> one tenan
On 2019/8/6 14:56, Aubrey Li wrote:
> On Tue, Aug 6, 2019 at 11:24 AM Aaron Lu wrote:
>> I've been thinking if we should consider core wide tenant fairness?
>>
>> Let's say there are 3 tasks on 2 threads' rq of the same core, 2 tasks
>> (e.g. A1, A2) b
On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote:
> On 8/2/19 8:37 AM, Julien Desfossez wrote:
> > We tested both Aaron's and Tim's patches and here are our results.
> >
> > Test setup:
> > - 2 1-thread sysbench, one running the cpu benchmark, the other one the
> > mem benchmark
> > - bo
and if so, do a schedule.
Signed-off-by: Aaron Lu
---
kernel/sched/fair.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 43babc2a12a5..730c9359e9c9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4093,6
The two values can differ greatly and can
cause tasks with a large vruntime to starve. So enable core scheduling
early when the system is still kind of idle for the time being to avoid
this problem.
Signed-off-by: Aaron Lu
---
kernel/sched/core.c | 15 ++---
Add a wrapper function cfs_rq_min_vruntime(cfs_rq) to
return cfs_rq->min_vruntime.
It will be used in the following patch, no functionality
change.
Signed-off-by: Aaron Lu
---
kernel/sched/fair.c | 27 ---
1 file changed, 16 insertions(+), 11 deletions(-)
diff --gi
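The wrapper this changelog describes amounts to a one-line accessor. A minimal sketch, assuming a stripped-down struct cfs_rq with only the one field (the real structure has many more members, and uint64_t stands in for the kernel's u64 here):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for the scheduler's struct cfs_rq. */
struct cfs_rq {
	uint64_t min_vruntime;
};

/*
 * Single accessor for min_vruntime. Callers that go through it do not need
 * to change if a later patch redefines what value is returned.
 */
static inline uint64_t cfs_rq_min_vruntime(const struct cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);
	return cfs_rq->min_vruntime;
}
```

Routing every read through the accessor is what lets a follow-up patch change the returned value in one place, which matches the "no functionality change" note in the changelog.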
On Mon, Jul 22, 2019 at 06:26:46PM +0800, Aubrey Li wrote:
> The granularity period of util_avg seems too large to decide task priority
> during pick_task(), at least it is in my case, cfs_prio_less() always picked
> core max task, so pick_task() eventually picked idle, which causes this change
> n
On 2019/7/22 18:26, Aubrey Li wrote:
> The granularity period of util_avg seems too large to decide task priority
> during pick_task(), at least it is in my case, cfs_prio_less() always picked
> core max task, so pick_task() eventually picked idle, which causes this change
> not very helpful for my
On Thu, Jul 18, 2019 at 04:27:19PM -0700, Tim Chen wrote:
>
>
> On 7/18/19 3:07 AM, Aaron Lu wrote:
> > On Wed, Jun 19, 2019 at 02:33:02PM -0400, Julien Desfossez wrote:
>
> >
> > With the below patch on top of v3 that makes use of util_avg to decide
> >
On Wed, Jun 19, 2019 at 02:33:02PM -0400, Julien Desfossez wrote:
> On 17-Jun-2019 10:51:27 AM, Aubrey Li wrote:
> > The result looks still unfair, and particularly, the variance is too high,
>
> I just want to confirm that I am also seeing the same issue with a
> similar setup. I also tried with
On Fri, May 31, 2019 at 02:53:21PM +0800, Aubrey Li wrote:
> On Fri, May 31, 2019 at 2:09 PM Aaron Lu wrote:
> >
> > On 2019/5/31 13:12, Aubrey Li wrote:
> > > On Fri, May 31, 2019 at 11:01 AM Aaron Lu
> > > wrote:
> > >>
> > >> This f
On 2019/5/31 13:12, Aubrey Li wrote:
> On Fri, May 31, 2019 at 11:01 AM Aaron Lu wrote:
>>
>> This feels like "date" failed to schedule on some CPU
>> on time.
>>
>> My first reaction is: when shell wakes up from sleep, it will
>> fork date.
On 2019/5/30 22:04, Aubrey Li wrote:
> On Thu, May 30, 2019 at 4:36 AM Vineeth Remanan Pillai
> wrote:
>>
>> Third iteration of the Core-Scheduling feature.
>>
>> This version fixes mostly correctness related issues in v2 and
>> addresses performance issues. Also, addressed some crashes related
>>
On Wed, May 08, 2019 at 01:49:09PM -0400, Julien Desfossez wrote:
> On 08-May-2019 10:30:09 AM, Aaron Lu wrote:
> > On Mon, May 06, 2019 at 03:39:37PM -0400, Julien Desfossez wrote:
> > > On 29-Apr-2019 11:53:21 AM, Aaron Lu wrote:
> > > > This is what I have used
On Mon, May 06, 2019 at 03:39:37PM -0400, Julien Desfossez wrote:
> On 29-Apr-2019 11:53:21 AM, Aaron Lu wrote:
> > This is what I have used to make sure no two unmatched tasks being
> > scheduled on the same core: (on top of v1, I think it's easier to just
> > show the
On Tue, Apr 23, 2019 at 04:18:18PM +, Vineeth Remanan Pillai wrote:
> +// XXX fairness/fwd progress conditions
> +static struct task_struct *
> +pick_task(struct rq *rq, const struct sched_class *class, struct task_struct
> *max)
> +{
> + struct task_struct *class_pick, *cookie_pick;
> +
On Tue, Apr 23, 2019 at 04:18:16PM +, Vineeth Remanan Pillai wrote:
> +/*
> + * Find left-most (aka, highest priority) task matching @cookie.
> + */
> +struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
> +{
> + struct rb_node *node = rq->core_tree.rb_node;
> + str
On Tue, Apr 23, 2019 at 04:18:14PM +, Vineeth Remanan Pillai wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c055bad249a9..45d86b862750 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4132,7 +4132,7 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct
On Tue, Apr 23, 2019 at 06:45:27PM +, Vineeth Remanan Pillai wrote:
> >> - Processes with different tags can still share the core
>
> > I may have missed something... Could you explain this statement?
>
> > This, to me, is the whole point of the patch series. If it's not
> > doing this then .
On Tue, Apr 23, 2019 at 04:18:16PM +, Vineeth Remanan Pillai wrote:
> +/*
> + * l(a,b)
> + * le(a,b) := !l(b,a)
> + * g(a,b) := l(b,a)
> + * ge(a,b) := !l(a,b)
> + */
> +
> +/* real prio, less is less */
> +static inline bool __prio_less(struct task_struct *a, struct task_struct *b,
> bool co
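The comment quoted above derives every ordering relation from a single "less" primitive. A minimal sketch with plain integers standing in for tasks (the int-based l() is an assumption for illustration; the quoted patch compares task priorities, and check_relations() is a hypothetical helper):

```c
#include <assert.h>
#include <stdbool.h>

/* l(a,b): the only primitive; integer "less than" is used as a stand-in. */
static bool l(int a, int b)
{
	return a < b;
}

/* le(a,b) := !l(b,a) */
static bool le(int a, int b)
{
	return !l(b, a);
}

/* g(a,b) := l(b,a) */
static bool g(int a, int b)
{
	return l(b, a);
}

/* ge(a,b) := !l(a,b) */
static bool ge(int a, int b)
{
	return !l(a, b);
}

/* Consistency check: the derived relations must agree with each other. */
static void check_relations(int a, int b)
{
	assert(g(a, b) == !le(a, b));
	assert(ge(a, b) == !l(a, b));
	assert(le(a, b) == (l(a, b) || (le(a, b) && ge(a, b))));
}
```

Defining only l() and deriving the rest keeps the four relations consistent by construction, so a change to the priority rule has to be made in one place only.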
On Tue, Apr 02, 2019 at 10:28:12AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 02, 2019 at 02:46:13PM +0800, Aaron Lu wrote:
...
> > Perhaps we can test if max is on the same cpu as class_pick and then
> > use cpu_prio_less() or core_prio_less() accordingly here, or just
> > r
On Wed, Apr 10, 2019 at 04:44:18PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 10, 2019 at 12:36:33PM +0800, Aaron Lu wrote:
> > On Tue, Apr 09, 2019 at 11:09:45AM -0700, Tim Chen wrote:
> > > Now that we have accumulated quite a number of different fixes to your
> > > o
On Wed, Apr 10, 2019 at 10:18:10PM +0800, Aubrey Li wrote:
> On Wed, Apr 10, 2019 at 12:36 PM Aaron Lu wrote:
> >
> > On Tue, Apr 09, 2019 at 11:09:45AM -0700, Tim Chen wrote:
> > > Now that we have accumulated quite a number of different fixes to your
> > > org
On Tue, Apr 09, 2019 at 11:09:45AM -0700, Tim Chen wrote:
> Now that we have accumulated quite a number of different fixes to your original
> posted patches. Would you like to post a v2 of the core scheduler with the
> fixes?
One more question I'm not sure: should a task with cookie=0, i.e. tasks
On Tue, Apr 02, 2019 at 10:28:12AM +0200, Peter Zijlstra wrote:
> Another approach would be something like the below:
>
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -87,7 +87,7 @@ static inline int __task_prio(struct tas
> */
>
> /* real prio, less is less */
> -static inli
On Tue, Apr 02, 2019 at 10:28:12AM +0200, Peter Zijlstra wrote:
> Another approach would be something like the below:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -87,7 +87,7 @@ static inline int __task_prio(struct tas
> */
>
> /* real prio, less is less */
> -static inline
On Tue, Mar 26, 2019 at 03:32:12PM +0800, Aaron Lu wrote:
> On Fri, Mar 08, 2019 at 11:44:01AM -0800, Subhra Mazumdar wrote:
> >
> > On 2/22/19 4:45 AM, Mel Gorman wrote:
> > >On Mon, Feb 18, 2019 at 09:49:10AM -0800, Linus Torvalds wrote:
> > >>On Mon, Fe
On Fri, Mar 08, 2019 at 11:44:01AM -0800, Subhra Mazumdar wrote:
>
> On 2/22/19 4:45 AM, Mel Gorman wrote:
> >On Mon, Feb 18, 2019 at 09:49:10AM -0800, Linus Torvalds wrote:
> >>On Mon, Feb 18, 2019 at 9:40 AM Peter Zijlstra wrote:
> >>>However; whichever way around you turn this cookie; it is ex
On Mon, Mar 11, 2019 at 05:20:19PM -0700, Greg Kerr wrote:
> On Mon, Mar 11, 2019 at 4:36 PM Subhra Mazumdar
> wrote:
> >
> >
> > On 3/11/19 11:34 AM, Subhra Mazumdar wrote:
> > >
> > > On 3/10/19 9:23 PM, Aubrey Li wrote:
> > >> On Sat, Mar 9, 2019 at 3:50 AM Subhra Mazumdar
> > >> wrote:
> > >>
On 2019/2/11 23:58, Alexander Duyck wrote:
> On Mon, 2019-02-11 at 14:40 +0800, Aaron Lu wrote:
>> On 2019/2/5 2:15, Alexander Duyck wrote:
>>> From: Alexander Duyck
>>>
>>> Because the implementation was limiting itself to only providing hints on
>>
On 2019/2/5 2:15, Alexander Duyck wrote:
> From: Alexander Duyck
>
> Because the implementation was limiting itself to only providing hints on
> pages huge TLB order sized or larger we introduced the possibility for free
> pages to slip past us because they are freed as something less than
> huge
page_frag_free() calls __free_pages_ok() to free the page back to
Buddy. This is OK for high-order pages, but for order-0 pages it
misses the optimization opportunity of using Per-Cpu-Pages and can
cause zone lock contention when called frequently.
Paweł Staszewski recently shared his result of 'h
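The effect described in this changelog can be illustrated with a toy model that counts how often a zone lock would be taken. This is a sketch under stated assumptions, not kernel code: struct toy_zone, toy_free_page(), free_to_buddy() and the PCP_BATCH value are hypothetical, and a single counter stands in for the per-CPU lists.

```c
#include <assert.h>
#include <stddef.h>

#define PCP_BATCH 4	/* hypothetical number of pages drained at once */

struct toy_zone {
	unsigned long lock_acquisitions;	/* times the zone lock was taken */
	unsigned long buddy_pages;		/* base pages held by Buddy */
	unsigned int pcp_count;			/* order-0 pages parked per CPU */
};

/* Direct path: every call takes the zone lock once. */
static void free_to_buddy(struct toy_zone *z, unsigned int order)
{
	z->lock_acquisitions++;
	z->buddy_pages += 1UL << order;
}

/*
 * Order-0 pages are parked on the per-CPU list without the zone lock and
 * drained in batches; higher orders go straight to Buddy.
 */
static void toy_free_page(struct toy_zone *z, unsigned int order)
{
	assert(z != NULL);

	if (order != 0) {
		free_to_buddy(z, order);
		return;
	}

	if (++z->pcp_count < PCP_BATCH)
		return;

	/* Drain the whole batch under a single lock acquisition. */
	z->lock_acquisitions++;
	z->buddy_pages += z->pcp_count;
	z->pcp_count = 0;
}
```

In this model, freeing eight order-0 pages takes the lock twice instead of eight times, which is the kind of contention reduction the changelog is after.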
Pages are freed in multiple places that all do the same thing, so a
common function can be used to reduce code duplication.
It also avoids a bug being fixed in one function but left in another.
Acked-by: Vlastimil Babka
Signed-off-by: Aaron Lu
---
mm/page_alloc.c | 37
single function to free page
https://lkml.kernel.org/r/20181106113149.gc24...@intel.com
With some changelog rewording.
Applies on top of v4.20-rc2-mmotm-2018-11-16-14-52.
Aaron Lu (2):
mm/page_alloc: free order-0 pages through PCP in page_frag_free()
mm/page_alloc: use a single function to
ruct is still needed in case
nr_node_ids is really big on some systems.
Cc: Vasily Averin
Cc: Michal Hocko
Cc: Huang Ying
Signed-off-by: Aaron Lu
---
include/linux/swap.h | 11 ++-
mm/swapfile.c| 3 ++-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/li
On Fri, Nov 09, 2018 at 08:19:54AM +0800, Rong Chen wrote:
>
>
> On 11/02/2018 04:36 PM, Daniel Borkmann wrote:
> > Hi Rong,
> >
> > On 11/02/2018 03:14 AM, kernel test robot wrote:
> > > Greeting,
> > >
> > > FYI, we noticed a -4.0% regression of will-it-scale.per_process_ops due
> > > to com
On Tue, Nov 06, 2018 at 10:32:00AM +0100, Vlastimil Babka wrote:
> On 11/6/18 9:47 AM, Aaron Lu wrote:
> > On Tue, Nov 06, 2018 at 09:16:20AM +0100, Vlastimil Babka wrote:
> >> On 11/6/18 6:30 AM, Aaron Lu wrote:
> >>> We have multiple places of freeing a pa
to add
> Fixes: a2468cc9bfdf ("swap: choose swap device according to numa node")
>
> because not being able to add a swap space on a fragmented system looks
> like a regression to me.
Agree, especially it used to work.
Regards,
Aaron
> > Acked-by: Aaron Lu
>
I didn't realize
this problem when developing this patch, thanks for pointing this out.
I think using kvzalloc() as is done by your patch is better here as it
can avoid possible failure of swapon.
Acked-by: Aaron Lu
BTW, for systems with few swap devices this may not be a big deal, but
accordin
On Wed, Oct 24, 2018 at 06:01:37PM +0100, Patrick Bellasi wrote:
> On 24-Oct 14:41, Aaron Lu wrote:
> > On Mon, Apr 02, 2018 at 11:20:00AM +0800, Ye, Xiaolong wrote:
> > >
> > > Greeting,
> > >
> > > FYI, we noticed a -9.9% regression of unixbench.sc
On Mon, Apr 02, 2018 at 11:20:00AM +0800, Ye, Xiaolong wrote:
>
> Greeting,
>
> FYI, we noticed a -9.9% regression of unixbench.score due to commit:
>
>
> commit: d519329f72a6f36bc4f2b85452640cfe583b4f81 ("sched/fair: Update
> util_est only on util_avg updates")
> https://git.kernel.org/cgit/l
On Mon, Oct 22, 2018 at 11:37:53AM +0200, Vlastimil Babka wrote:
> On 10/17/18 8:33 AM, Aaron Lu wrote:
> > Profile on Intel Skylake server shows the most time consuming part
> > under zone->lock on allocation path is accessing those to-be-returned
> > page's "st
On Fri, Oct 19, 2018 at 08:00:53AM -0700, Daniel Jordan wrote:
> On Fri, Oct 19, 2018 at 09:54:35AM +0100, Mel Gorman wrote:
> > On Fri, Oct 19, 2018 at 01:57:03PM +0800, Aaron Lu wrote:
> > > >
> > > > I don't think this is the right way of thinking about it
On Thu, Oct 18, 2018 at 12:16:32PM +0100, Mel Gorman wrote:
> On Wed, Oct 17, 2018 at 10:59:04PM +0800, Aaron Lu wrote:
> > > Any particular reason why? I assume it's related to the number of zone
> > > locks with the increase number of zones and the number of threa
On Thu, Oct 18, 2018 at 12:20:55PM +0100, Mel Gorman wrote:
> On Wed, Oct 17, 2018 at 10:23:27PM +0800, Aaron Lu wrote:
> > > RT has had problems with cpu_relax in the past but more importantly, as
> > > this delay for parallel compactions and allocations of contig ranges,
>
On Thu, Oct 18, 2018 at 10:23:22AM +0200, Vlastimil Babka wrote:
> On 10/18/18 8:48 AM, Aaron Lu wrote:
> > On Wed, Oct 17, 2018 at 07:03:30PM +0200, Vlastimil Babka wrote:
> >> On 10/17/18 3:58 PM, Mel Gorman wrote:
> >>> Again, as compaction is not guaranteed to
On Wed, Oct 17, 2018 at 07:03:30PM +0200, Vlastimil Babka wrote:
> On 10/17/18 3:58 PM, Mel Gorman wrote:
> > Again, as compaction is not guaranteed to find the pageblocks, it would
> > be important to consider whether a) that matters or b) find an
> > alternative way of keeping unmerged buddies on
On Wed, Oct 17, 2018 at 02:58:07PM +0100, Mel Gorman wrote:
> On Wed, Oct 17, 2018 at 09:10:59PM +0800, Aaron Lu wrote:
> > On Wed, Oct 17, 2018 at 11:44:27AM +0100, Mel Gorman wrote:
> > > On Wed, Oct 17, 2018 at 02:33:27PM +0800, Aaron Lu wrote:
> > > > Running wil
On Wed, Oct 17, 2018 at 12:20:42PM +0100, Mel Gorman wrote:
> On Wed, Oct 17, 2018 at 02:33:28PM +0800, Aaron Lu wrote:
> > Profile on Intel Skylake server shows the most time consuming part
> > under zone->lock on allocation path is accessing those to-be-returned
> > pag
On Wed, Oct 17, 2018 at 11:44:27AM +0100, Mel Gorman wrote:
> On Wed, Oct 17, 2018 at 02:33:27PM +0800, Aaron Lu wrote:
> > Running will-it-scale/page_fault1 process mode workload on a 2 sockets
> > Intel Skylake server showed severe lock contention of zone->lock, as
> >
ree when problem would occur, e.g. when
compaction is in progress.
Signed-off-by: Aaron Lu
---
mm/internal.h | 4
1 file changed, 4 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index fb4e8f7976e5..309a3f43e613 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -538,10 +538,6
There are multiple places that add/remove a page into/from buddy;
introduce helper functions for them.
This also makes it easier to add code when a page is added/removed
to/from buddy.
No functionality change.
Acked-by: Vlastimil Babka
Signed-off-by: Aaron Lu
---
mm/page_alloc.c | 65
es_bulk(), we can avoid calling
add_to_cluster() once per page and instead add them in one go as
a single cluster, which is what this patch does.
This optimization brings zone->lock contention down from 25% to
almost zero again using the parallel free workload.
Signed-off-by: Aaron Lu
---
mm/p
could eliminate zone->lock contention entirely but at
the same time, pgdat->lru_lock contention rose to 82%. Final performance
increased about 8.3%.
Suggested-by: Ying Huang
Suggested-by: Dave Hansen
Signed-off-by: Aaron Lu
---
include/linux/mm_types.h | 19 +--
include/linux/mmzon
.
Though performance dropped a little, it almost eliminated zone lock
contention on free path and it is the foundation for the next patch
that eliminates zone lock contention for allocation path.
Suggested-by: Dave Hansen
Signed-off-by: Aaron Lu
---
include/linux/mm_types.h | 9 +++-
mm/compa
//lkml.kernel.org/r/1489568404-7817-1-git-send-email-aaron...@intel.com
A branch is maintained here in case someone wants to give it a try:
https://github.com/aaronlu/linux no_merge_cluster_alloc_4.19-rc5
v4:
- rebased to v4.19-rc5;
- add numbers from netperf(courtesy of Tariq Toukan)
Aaron Lu (
On Fri, Sep 21, 2018 at 10:45:36AM -0700, Daniel Jordan wrote:
> On Tue, Sep 11, 2018 at 01:36:07PM +0800, Aaron Lu wrote:
> > Daniel Jordan and others proposed an innovative technique to make
> > multiple threads concurrently use list_del() at any position of the
> > list a
page and we
lose the merge opportunity for them. With this patch, we will have
mergeable pages left unmerged in Buddy.
Due to this, I don't see much value in keeping the range lock, which
is used to prevent such a thing from happening, so the range lock is
removed in this patch.
Signed-off-by: Aar
For testing purposes, do not add anon pages to the LRU, which avoids
the LRU lock so that the zone lock can be tested exclusively.
Signed-off-by: Aaron Lu
---
mm/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index c467102a5cbc..080641255b8b 100644
From: Daniel Jordan
Now that the LRU lock is a RW lock, lay the groundwork for fine-grained
synchronization so that multiple threads holding the lock as reader can
safely remove pages from an LRU at the same time.
Add a thread-safe variant of list_del called smp_list_del that allows
multiple thr
There are multiple places that add/remove a page into/from buddy;
introduce helper functions for them.
This also makes it easier to add code when a page is added/removed
to/from buddy.
No functionality change.
Acked-by: Vlastimil Babka
Signed-off-by: Aaron Lu
---
mm/page_alloc.c | 65
inate zone lock
contention entirely, but has worse fragmentation issue.
[0] https://lwn.net/Articles/753058/
[1] https://lkml.kernel.org/r/20180911004240.4758-1-daniel.m.jor...@oracle.com
[2] https://lkml.kernel.org/r/20180509085450.3524-1-aaron...@intel.com
Aaron Lu (7):
mm: do not add anon pages to LR
From: Daniel Jordan
Now that we splice a local list onto the LRU, prepare for multiple tasks
doing this concurrently by adding a variant of the kernel's list
splicing API, list_splice, that's designed to work with multiple tasks.
Although there is naturally less parallelism to be gained from loc