[PATCH v3] sched,fair: skip newidle_balance if a wakeup is pending

2021-04-20 Thread Rik van Riel
and p95 application response time by 10% on average. The schedstats run_delay number shows a similar improvement. Signed-off-by: Rik van Riel --- kernel/sched/fair.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
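
[Editor's note] The mechanism can be sketched as below — a minimal sketch of the idea rather than the verbatim v3 patch, assuming the runqueue's ttwu_pending flag (which mainline uses to mark queued wakeups) is the signal being checked:

/* kernel/sched/fair.c -- sketch, not the verbatim patch */
static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
{
	/*
	 * A wakeup is already queued for this runqueue; running that
	 * task is cheaper than searching other CPUs for work to pull.
	 */
	if (this_rq->ttwu_pending)
		return 0;

	/*
	 * ... the existing newidle balancing walk follows, re-checking
	 * this_rq->ttwu_pending between sched domains so a wakeup that
	 * arrives mid-search also aborts the balance ...
	 */
}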

Re: [PATCH v2] sched,fair: skip newidle_balance if a wakeup is pending

2021-04-20 Thread Rik van Riel
On Tue, 2021-04-20 at 11:04 +0200, Vincent Guittot wrote: > On Mon, 19 Apr 2021 at 18:51, Rik van Riel wrote: > > > > @@ -10688,7 +10697,7 @@ static int newidle_balance(struct rq > > *this_rq, struct rq_flags *rf) > > if (this_rq->nr_runnin

[PATCH v2] sched,fair: skip newidle_balance if a wakeup is pending

2021-04-19 Thread Rik van Riel
and p95 application response time by 2-3% on average. The schedstats run_delay number shows a similar improvement. Signed-off-by: Rik van Riel --- v2: - fix !SMP build error and prev-not-CFS case by moving check into newidle_balance - fix formatting of if condition - audit newidle_balance

Re: [PATCH] sched,fair: skip newidle_balance if a wakeup is pending

2021-04-19 Thread Rik van Riel
On Mon, 2021-04-19 at 12:22 +0100, Valentin Schneider wrote: > On 18/04/21 22:17, Rik van Riel wrote: > > @@ -10661,7 +10669,8 @@ static int newidle_balance(struct rq > > *this_rq, struct rq_flags *rf) > >* Stop searching for tasks to pull if there are >

[PATCH] sched,fair: skip newidle_balance if a wakeup is pending

2021-04-18 Thread Rik van Riel
and p95 application response time by 2-3% on average. The schedstats run_delay number shows a similar improvement. Signed-off-by: Rik van Riel --- kernel/sched/fair.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index

Re: [PATCH 2/2] sched/fair: Relax task_hot() for misfit tasks

2021-04-15 Thread Rik van Riel
them over to a higher-capacity CPU. > > Align detach_tasks() with the active-balance logic and let it pick a > cache-hot misfit task when the destination CPU can provide a capacity > uplift. > > Signed-off-by: Valentin Schneider Reviewed-by: Rik van Riel This patch looks goo

Re: [PATCH 1/2] sched/fair: Filter out locally-unsolvable misfit imbalances

2021-04-15 Thread Rik van Riel
e. > > Signed-off-by: Valentin Schneider Reviewed-by: Rik van Riel -- All Rights Reversed.

Re: [PATCH v2 00/16] Multigenerational LRU Framework

2021-04-14 Thread Rik van Riel
On Wed, 2021-04-14 at 13:14 -0600, Yu Zhao wrote: > On Wed, Apr 14, 2021 at 9:59 AM Rik van Riel > wrote: > > On Wed, 2021-04-14 at 08:51 -0700, Andi Kleen wrote: > > > >2) It will not scan PTE tables under non-leaf PMD entries > > > > that > > >

Re: [PATCH v2 00/16] Multigenerational LRU Framework

2021-04-14 Thread Rik van Riel
On Wed, 2021-04-14 at 08:51 -0700, Andi Kleen wrote: > >2) It will not scan PTE tables under non-leaf PMD entries that > > do not > > have the accessed bit set, when > > CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG=y. > > This assumes that workloads have reasonable locality. Could there >

Re: [PATCH v2 00/16] Multigenerational LRU Framework

2021-04-14 Thread Rik van Riel
On Wed, 2021-04-14 at 16:27 +0800, Huang, Ying wrote: > Yu Zhao writes: > > > On Wed, Apr 14, 2021 at 12:15 AM Huang, Ying > > wrote: > > > > > NUMA Optimization > > - > > Support NUMA policies and per-node RSS counters. > > > > We only can move forward one step at a time.

Re: [PATCH v2 00/16] Multigenerational LRU Framework

2021-04-13 Thread Rik van Riel
On Wed, 2021-04-14 at 09:14 +1000, Dave Chinner wrote: > On Tue, Apr 13, 2021 at 10:13:24AM -0600, Jens Axboe wrote: > > > The initial posting of this patchset did no better, in fact it did > > a bit > > worse. Performance dropped to the same levels and kswapd was using > > as > > much CPU as

[tip: sched/core] sched/fair: Bring back select_idle_smt(), but differently

2021-04-09 Thread tip-bot2 for Rik van Riel
The following commit has been merged into the sched/core branch of tip: Commit-ID: c722f35b513f807629603bbf24640b1a48be21b5 Gitweb: https://git.kernel.org/tip/c722f35b513f807629603bbf24640b1a48be21b5 Author: Rik van Riel AuthorDate: Fri, 26 Mar 2021 15:19:32 -04:00

[tip: sched/core] sched/fair: Bring back select_idle_smt(), but differently

2021-04-09 Thread tip-bot2 for Rik van Riel
The following commit has been merged into the sched/core branch of tip: Commit-ID: 6bcd3e21ba278098920d26d4888f5e6f4087c61d Gitweb: https://git.kernel.org/tip/6bcd3e21ba278098920d26d4888f5e6f4087c61d Author: Rik van Riel AuthorDate: Fri, 26 Mar 2021 15:19:32 -04:00

Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-04-08 Thread Rik van Riel
On Wed, 2021-04-07 at 12:19 +0200, Peter Zijlstra wrote: > On Wed, Apr 07, 2021 at 11:54:37AM +0200, Peter Zijlstra wrote: > > > Let me have another poke at it. > > Pretty much what you did, except I also did s/smt/has_idle_core/ and > fixed that @sd thing. > > Like so then? Looks good to me.

Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-04-06 Thread Rik van Riel
On Tue, 2021-04-06 at 17:31 +0200, Vincent Guittot wrote: > On Tue, 6 Apr 2021 at 17:26, Rik van Riel wrote: > > On Tue, 2021-04-06 at 17:10 +0200, Vincent Guittot wrote: > > > On Fri, 26 Mar 2021 at 20:19, Rik van Riel > > > wrote: > > > > > > >

Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-04-06 Thread Rik van Riel
On Tue, 2021-04-06 at 17:10 +0200, Vincent Guittot wrote: > On Fri, 26 Mar 2021 at 20:19, Rik van Riel wrote: > > > -static int select_idle_cpu(struct task_struct *p, struct > > sched_domain *sd, int target) > > +static int select_idle_cpu(struct task_struct *p, struct &

[PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-03-26 Thread Rik van Riel
p99 response times for the memcache type application improve by about 10% over what they were before Mel's patches got merged. Signed-off-by: Rik van Riel --- kernel/sched/fair.c | 68 ++--- 1 file changed, 52 insertions(+), 16 deletions(-) diff --git a/kernel/
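
[Editor's note] The shape of the change, pieced together from the follow-ups below (the cpus_share_cache(prev, target) call site is quoted verbatim later in this thread; the helper body is a sketch consistent with what was eventually merged):

/* Scan the SMT siblings of a CPU for an idle one -- sketch. */
static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
{
	int cpu;

	for_each_cpu(cpu, cpu_smt_mask(target)) {
		if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
		    !cpumask_test_cpu(cpu, sched_domain_span(sd)))
			continue;
		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
			return cpu;
	}

	return -1;
}

/* Call site in select_idle_sibling(), when no idle core was found: */
if (cpus_share_cache(prev, target)) {
	/* No idle core. Check if prev has an idle sibling. */
	i = select_idle_smt(p, sd, prev);
	if ((unsigned int)i < nr_cpumask_bits)
		return i;
}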

Re: [PATCH v2] sched/fair: bring back select_idle_smt, but differently

2021-03-22 Thread Rik van Riel
On Mon, 2021-03-22 at 15:33 +, Mel Gorman wrote: > If trying that, I would put that in a separate patch. At one point > I did play with clearing prev, target and recent but hit problems. > Initialising the mask and clearing them in select_idle_sibling() hurt > the fast path and doing it later

Re: [PATCH v2] sched/fair: bring back select_idle_smt, but differently

2021-03-22 Thread Rik van Riel
On Mon, 2021-03-22 at 11:03 +, Mel Gorman wrote: > On Sun, Mar 21, 2021 at 03:03:58PM -0400, Rik van Riel wrote: > > Mel Gorman did some nice work in 9fe1f127b913 > > ("sched/fair: Merge select_idle_core/cpu()"), resulting in the > > kernel > >

[PATCH v2] sched/fair: bring back select_idle_smt, but differently

2021-03-21 Thread Rik van Riel
and the continuous 2% CPU use regression on the memcache type workload. With Mel's patches and this patch together, the p95 and p99 response times for the memcache type application improve by about 20% over what they were before Mel's patches got merged. Signed-off-by:

Re: [PATCH] sched/fair: bring back select_idle_smt, but differently

2021-03-21 Thread Rik van Riel
On Sun, 2021-03-21 at 14:48 -0400, Rik van Riel wrote: > > + if (cpus_share_cache(prev, target)) { > + /* No idle core. Check if prev has an idle sibling. */ > + i = select_idle_smt(p, sd, prev); Uh, one minute. This is the wrong version of the pat

[PATCH] sched/fair: bring back select_idle_smt, but differently

2021-03-21 Thread Rik van Riel
and the continuous 2% CPU use regression on the memcache type workload. With Mel's patches and this patch together, the p95 and p99 response times for the memcache type application improve by about 20% over what they were before Mel's patches got merged. Signed-off-by:

Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-15 Thread Rik van Riel
On Sat, 2021-03-13 at 00:57 -0700, Yu Zhao wrote: > +/* > + * After pages are faulted in, they become the youngest generation. > They must > + * go through aging process twice before they can be evicted. After > first scan, > + * their accessed bit set during initial faults are cleared and they >

Re: [PATCH] sched/fair: Prefer idle CPU to cache affinity

2021-02-27 Thread Rik van Riel
On Fri, 2021-02-26 at 22:10 +0530, Srikar Dronamraju wrote: > Current order of preference to pick a LLC while waking a wake-affine > task: > 1. Between the waker CPU and previous CPU, prefer the LLC of the CPU >that is idle. > > 2. Between the waker CPU and previous CPU, prefer the LLC of

[PATCH 4/3] mm,shmem,thp: limit shmem THP allocations to requested zones

2021-02-24 Thread Rik van Riel
On Wed, 24 Feb 2021 08:55:40 -0800 (PST) Hugh Dickins wrote: > On Wed, 24 Feb 2021, Rik van Riel wrote: > > On Wed, 2021-02-24 at 00:41 -0800, Hugh Dickins wrote: > > > Oh, I'd forgotten all about that gma500 aspect: > > > well, I can send a fixup later on. > >

Re: [PATCH v6 0/3] mm,thp,shm: limit shmem THP alloc gfp_mask

2021-02-24 Thread Rik van Riel
On Wed, 2021-02-24 at 00:41 -0800, Hugh Dickins wrote: > On Mon, 14 Dec 2020, Vlastimil Babka wrote: > > > > (There's also a specific issue with the gfp_mask limiting: I have > > > not yet reviewed the allowing and denying in detail, but it looks > > > like it does not respect the caller's

Re: [PATCH RFC clocksource 2/5] clocksource: Retry clock read if long delays detected

2021-01-06 Thread Rik van Riel
On Wed, 2021-01-06 at 11:53 -0800, Paul E. McKenney wrote: > On Wed, Jan 06, 2021 at 11:28:00AM -0500, Rik van Riel wrote: > > > + wdagain_nsec = clocksource_cyc2ns(delta, watchdog->mult, watchdog->shift); > > + if (wd
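
[Editor's note] The retry logic under discussion can be sketched as follows, consistent with what was later merged as cs_watchdog_read() (constants such as WATCHDOG_MAX_SKEW come from that code):

/* Sketch: re-read the watchdog when the read pair itself took too
 * long (e.g. the vCPU was preempted), rather than declaring the
 * clocksource unstable on the basis of one slow read. */
static bool cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
{
	unsigned int nretries;
	u64 wd_end, wd_delta;
	int64_t wd_delay;

	for (nretries = 0; nretries <= max_cswd_read_retries; nretries++) {
		local_irq_disable();
		*wdnow = watchdog->read(watchdog);
		*csnow = cs->read(cs);
		wd_end = watchdog->read(watchdog);
		local_irq_enable();

		wd_delta = clocksource_delta(wd_end, *wdnow, watchdog->mask);
		wd_delay = clocksource_cyc2ns(wd_delta, watchdog->mult,
					      watchdog->shift);
		if (wd_delay <= WATCHDOG_MAX_SKEW)
			return true;	/* reads fast enough to trust */
	}

	return false;	/* persistently slow reads: skip this check */
}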

Re: [PATCH RFC clocksource 2/5] clocksource: Retry clock read if long delays detected

2021-01-06 Thread Rik van Riel
On Tue, 2021-01-05 at 16:41 -0800, paul...@kernel.org wrote: > > @@ -203,7 +204,6 @@ static void > clocksource_watchdog_inject_delay(void) > injectfail = inject_delay_run; > if (!(++injectfail / inject_delay_run % inject_delay_freq)) { > printk("%s(): Injecting

Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting

2020-12-03 Thread Rik van Riel
On Thu, 2020-12-03 at 12:31 +, Matthew Wilcox wrote: > And this just makes me think RCU freeing of mm_struct. I'm sure it's > more complicated than that (then, or now), but if an anonymous > process > is borrowing a freed mm, and the mm is freed by RCU then it will not > go > away until the

Re: [PATCH 2/3] mm,thp,shm: limit gfp mask to no more than specified

2020-11-30 Thread Rik van Riel
On Mon, 2020-11-30 at 11:00 +0100, Michal Hocko wrote: > On Fri 27-11-20 14:03:39, Rik van Riel wrote: > > On Fri, 2020-11-27 at 08:52 +0100, Michal Hocko wrote: > > > On Thu 26-11-20 13:04:14, Rik van Riel wrote: > > > > I would be more than happy to implement things

Re: [PATCH 2/3] mm,thp,shm: limit gfp mask to no more than specified

2020-11-27 Thread Rik van Riel
On Fri, 2020-11-27 at 08:52 +0100, Michal Hocko wrote: > On Thu 26-11-20 13:04:14, Rik van Riel wrote: > > > > I would be more than happy to implement things differently, > > but I am not sure what alternative you are suggesting. > > Simply do not alter gfp f

Re: [PATCH 3/3] mm,thp,shmem: make khugepaged obey tmpfs mount flags

2020-11-26 Thread Rik van Riel
On Thu, 2020-11-26 at 20:42 +0100, Vlastimil Babka wrote: > On 11/26/20 7:14 PM, Rik van Riel wrote: > > On Thu, 2020-11-26 at 18:18 +0100, Vlastimil Babka wrote: > > > > > This patch makes khugepaged treat the mount options > > and/or > > sysfs flag as enabl

Re: [PATCH 3/3] mm,thp,shmem: make khugepaged obey tmpfs mount flags

2020-11-26 Thread Rik van Riel
On Thu, 2020-11-26 at 18:18 +0100, Vlastimil Babka wrote: > On 11/24/20 8:49 PM, Rik van Riel wrote: > > Currently if thp enabled=[madvise], mounting a tmpfs filesystem > > with huge=always and mmapping files from that tmpfs does not > > result in khugepaged collapsing th

Re: [PATCH 2/3] mm,thp,shm: limit gfp mask to no more than specified

2020-11-26 Thread Rik van Riel
On Thu, 2020-11-26 at 14:40 +0100, Michal Hocko wrote: > On Tue 24-11-20 14:49:24, Rik van Riel wrote: > > Matthew Wilcox pointed out that the i915 driver opportunistically > > allocates tmpfs memory, but will happily reclaim some of its > > pool if no memory is availabl

[PATCH 1/3] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-11-24 Thread Rik van Riel
ay for files mmapped with MADV_HUGEPAGE, and a little less aggressive for files that are not mmapped or mapped without that flag. Signed-off-by: Rik van Riel --- include/linux/gfp.h | 2 ++ mm/huge_memory.c| 6 +++--- mm/shmem.c | 8 +--- 3 files changed, 10 insertions(+), 6 deletions(-)
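
[Editor's note] The gist of this patch: shmem derives its huge-page gfp mask from the THP "defrag" policy instead of unconditionally allowing direct reclaim. A sketch of the shmem side, using the helper name from the later v4 posting (vma_thp_gfp_mask, renamed from alloc_hugepage_direct_gfpmask):

/* mm/shmem.c, shmem_getpage_gfp() huge path -- sketch */
alloc_huge:
	huge_gfp = vma_thp_gfp_mask(vma);	/* honors the defrag policy */
	page = shmem_alloc_and_acct_page(huge_gfp, inode, index, true);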

[PATCH 2/3] mm,thp,shm: limit gfp mask to no more than specified

2020-11-24 Thread Rik van Riel
-by: Rik van Riel Suggested-by: Matthew Wilcox --- mm/shmem.c | 21 + 1 file changed, 21 insertions(+) diff --git a/mm/shmem.c b/mm/shmem.c index 6c3cb192a88d..ee3cea10c2a4 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1531,6 +1531,26 @@ static struct page *shmem_swapin

[PATCH 3/3] mm,thp,shmem: make khugepaged obey tmpfs mount flags

2020-11-24 Thread Rik van Riel
bit, and testing things in the correct order. Signed-off-by: Rik van Riel Fixes: c2231020ea7b ("mm: thp: register mm for khugepaged when merging vma for shmem") --- include/linux/khugepaged.h | 2 ++ mm/khugepaged.c| 22 -- 2 files changed, 18 insert

[PATCH v6 0/3] mm,thp,shm: limit shmem THP alloc gfp_mask

2020-11-24 Thread Rik van Riel
The allocation flags of anonymous transparent huge pages can be controlled through the files in /sys/kernel/mm/transparent_hugepage/defrag, which can keep the system from getting bogged down in the page reclaim and compaction code when many THPs are getting allocated simultaneously. However, the

Re: [PATCH 2/2] mm,thp,shm: limit gfp mask to no more than specified

2020-11-23 Thread Rik van Riel
On Thu, 2020-11-19 at 10:38 +0100, Michal Hocko wrote: > On Fri 13-11-20 22:40:40, Rik van Riel wrote: > > On Thu, 2020-11-12 at 12:22 +0100, Michal Hocko wrote: > > > [Cc Chris for i915 and Andray] > > > > > > On Thu 05-11-20 14:15:08, Rik van Riel wrote

Re: [PATCH 1/2] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-11-13 Thread Rik van Riel
On Thu, 2020-11-12 at 11:52 +0100, Michal Hocko wrote: > On Thu 05-11-20 14:15:07, Rik van Riel wrote: > > > > This patch applies the same configurated limitation of THPs to > > shmem > > hugepage allocations, to prevent that from happening. > > I believe you

Re: [PATCH 2/2] mm,thp,shm: limit gfp mask to no more than specified

2020-11-13 Thread Rik van Riel
On Thu, 2020-11-12 at 12:22 +0100, Michal Hocko wrote: > [Cc Chris for i915 and Andray] > > On Thu 05-11-20 14:15:08, Rik van Riel wrote: > > Matthew Wilcox pointed out that the i915 driver opportunistically > > allocates tmpfs memory, but will happily reclaim some of its

Re: [PATCH 2/2] mm,thp,shm: limit gfp mask to no more than specified

2020-11-06 Thread Rik van Riel
> > Make sure the gfp mask used to opportunistically allocate a THP > > is always at least as restrictive as the original gfp mask. > > > > Signed-off-by: Rik van Riel > > Suggested-by: Matthew Wilcox > > --- > > mm/shmem.c | 21 + > &

[PATCH 1/2] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-11-05 Thread Rik van Riel
ttle more aggressive than today for files mmapped with MADV_HUGEPAGE, and a little less aggressive for files that are not mmapped or mapped without that flag. Signed-off-by: Rik van Riel --- include/linux/gfp.h | 2 ++ mm/huge_memory.c| 6 +++--- mm/shmem.c | 8 +--- 3 files changed,

[PATCH 2/2] mm,thp,shm: limit gfp mask to no more than specified

2020-11-05 Thread Rik van Riel
-by: Rik van Riel Suggested-by: Matthew Wilcox --- mm/shmem.c | 21 + 1 file changed, 21 insertions(+) diff --git a/mm/shmem.c b/mm/shmem.c index 6c3cb192a88d..ee3cea10c2a4 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1531,6 +1531,26 @@ static struct page *shmem_swapin
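
[Editor's note] The 21 lines added to mm/shmem.c are a helper that clamps the opportunistic huge-page mask to be no less restrictive than the caller's mask; a sketch matching the eventually-merged version:

static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t limit_gfp)
{
	gfp_t allowflags = __GFP_IO | __GFP_FS | __GFP_RECLAIM;
	gfp_t denyflags = __GFP_NOWARN | __GFP_NORETRY;
	gfp_t zoneflags = limit_gfp & GFP_ZONEMASK;
	gfp_t result = huge_gfp & ~(allowflags | GFP_ZONEMASK);

	/* Allow allocations only from the originally specified zones. */
	result |= zoneflags;

	/*
	 * Minimize the result gfp by taking the union with the deny flags,
	 * and the intersection of the allow flags.
	 */
	result |= (limit_gfp & denyflags);
	result |= (huge_gfp & limit_gfp) & allowflags;

	return result;
}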

[PATCH 0/2] mm,thp,shm: limit shmem THP alloc gfp_mask

2020-11-05 Thread Rik van Riel
The allocation flags of anonymous transparent huge pages can be controlled through the files in /sys/kernel/mm/transparent_hugepage/defrag, which can keep the system from getting bogged down in the page reclaim and compaction code when many THPs are getting allocated simultaneously. However, the

Re: [PATCH] sched/fair: ensure tasks spreading in LLC during LB

2020-11-02 Thread Rik van Riel
On Mon, 2020-11-02 at 11:24 +0100, Vincent Guittot wrote: > Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()") > Reported-by: Chris Mason > Suggested-by: Rik van Riel > Signed-off-by: Vincent Guittot Tested-and-reviewed-by: Rik van Riel Thank you! -

Re: [PATCH] fix scheduler regression from "sched/fair: Rework load_balance()"

2020-10-29 Thread Rik van Riel
On Mon, 2020-10-26 at 17:52 +0100, Vincent Guittot wrote: > On Mon, 26 Oct 2020 at 17:48, Chris Mason wrote: > > On 26 Oct 2020, at 12:20, Vincent Guittot wrote: > > > > > what you are suggesting is something like: > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > index

Re: [PATCH] fix scheduler regression from "sched/fair: Rework load_balance()"

2020-10-26 Thread Rik van Riel
On Mon, 26 Oct 2020 16:42:14 +0100 Vincent Guittot wrote: > On Mon, 26 Oct 2020 at 16:04, Rik van Riel wrote: > > Could utilization estimates be off, either lagging or > > simply having a wrong estimate for a task, resulting > > in no task getting pulled s

Re: [PATCH] fix scheduler regression from "sched/fair: Rework load_balance()"

2020-10-26 Thread Rik van Riel
On Mon, 2020-10-26 at 15:56 +0100, Vincent Guittot wrote: > On Mon, 26 Oct 2020 at 15:38, Rik van Riel wrote: > > On Mon, 2020-10-26 at 15:24 +0100, Vincent Guittot wrote: > > > Le lundi 26 oct. 2020 à 08:45:27 (-0400), Chris Mason a écrit : > > > > On 26 Oct 2020,

Re: [PATCH] fix scheduler regression from "sched/fair: Rework load_balance()"

2020-10-26 Thread Rik van Riel
On Mon, 2020-10-26 at 15:24 +0100, Vincent Guittot wrote: > Le lundi 26 oct. 2020 à 08:45:27 (-0400), Chris Mason a écrit : > > On 26 Oct 2020, at 4:39, Vincent Guittot wrote: > > > > > Hi Chris > > > > > > On Sat, 24 Oct 2020 at 01:49, Chris Mason wrote: > > > > Hi everyone, > > > > > > > >

Re: [PATCH v4] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-24 Thread Rik van Riel
On Sat, 2020-10-24 at 03:09 +0100, Matthew Wilcox wrote: > On Fri, Oct 23, 2020 at 08:48:04PM -0400, Rik van Riel wrote: > > The allocation flags of anonymous transparent huge pages can be > > controlled > > through the files in /sys/kernel/mm/transparent_hugepage/defrag, &

[PATCH v4] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-23 Thread Rik van Riel
ttle more aggressive than today for files mmapped with MADV_HUGEPAGE, and a little less aggressive for files that are not mmapped or mapped without that flag. Signed-off-by: Rik van Riel --- v4: rename alloc_hugepage_direct_gfpmask to vma_thp_gfp_mask (Matthew Wilcox) v3: fix NULL vma issue spo

[PATCH v3] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-23 Thread Rik van Riel
ttle more aggressive than today for files mmapped with MADV_HUGEPAGE, and a little less aggressive for files that are not mmapped or mapped without that flag. Signed-off-by: Rik van Riel --- v3: fix NULL vma issue spotted by Hugh Dickins & tested v2: move gfp calculation to shmem_getpage_gfp as

Re: [PATCH v2] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-22 Thread Rik van Riel
On Thu, 2020-10-22 at 19:54 -0700, Hugh Dickins wrote: > On Thu, 22 Oct 2020, Rik van Riel wrote: > > > The allocation flags of anonymous transparent huge pages can be > controlled > > through the files in /sys/kernel/mm/transparent_hugepage/defrag, > which can > >

Re: [PATCH] mm: memcontrol: add file_thp, shmem_thp to memory.stat

2020-10-22 Thread Rik van Riel
On Thu, 2020-10-22 at 12:49 -0400, Rik van Riel wrote: > On Thu, 2020-10-22 at 11:18 -0400, Johannes Weiner wrote: > > > index e80aa9d2db68..334ce608735c 100644 > > --- a/mm/filemap.c > > +++ b/mm/filemap.c > > @@ -204,9 +204,9 @@ static void unaccount_page_cache

Re: [PATCH] mm: memcontrol: add file_thp, shmem_thp to memory.stat

2020-10-22 Thread Rik van Riel
On Thu, 2020-10-22 at 11:18 -0400, Johannes Weiner wrote: > index e80aa9d2db68..334ce608735c 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -204,9 +204,9 @@ static void unaccount_page_cache_page(struct > address_space *mapping, > if (PageSwapBacked(page)) { >
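
[Editor's note] The hunk quoted here converts bare node counters to memcg-aware lruvec counters so the THP counts appear in memory.stat; roughly (a sketch of the direction, not the exact hunk):

/* mm/filemap.c, unaccount_page_cache_page() -- after the change */
if (PageSwapBacked(page)) {
	__mod_lruvec_page_state(page, NR_SHMEM, -nr);
	if (PageTransHuge(page))
		__dec_lruvec_page_state(page, NR_SHMEM_THPS);
} else if (PageTransHuge(page)) {
	__dec_lruvec_page_state(page, NR_FILE_THPS);
	filemap_nr_thps_dec(mapping);
}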

[PATCH v2] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-22 Thread Rik van Riel
hugepage allocations, to prevent that from happening. This way a THP defrag setting of "never" or "defer+madvise" will result in quick allocation failures without direct reclaim when no 2MB free pages are available. Signed-off-by: Rik van Riel --- v2: move gfp calculation

Re: [PATCH] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-22 Thread Rik van Riel
On Fri, 2020-10-23 at 00:00 +0800, Yu Xu wrote: > On 10/22/20 11:48 AM, Rik van Riel wrote: > > The allocation flags of anonymous transparent huge pages can be > > controlled > > through the files in /sys/kernel/mm/transparent_hugepage/defrag, > > which can > > he

Re: [PATCH] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-22 Thread Rik van Riel
On Thu, 2020-10-22 at 17:50 +0200, Michal Hocko wrote: > On Thu 22-10-20 09:25:21, Rik van Riel wrote: > > On Thu, 2020-10-22 at 10:15 +0200, Michal Hocko wrote: > > > On Wed 21-10-20 23:48:46, Rik van Riel wrote: > > > > > > > > diff --git a/mm/shmem

Re: [PATCH] mm/shmem: fix up gfpmask for shmem hugepage allocation

2020-10-22 Thread Rik van Riel
On Wed, 2020-10-21 at 16:09 +0800, Xu Yu wrote: > @@ -1887,6 +1930,7 @@ static int shmem_getpage_gfp(struct inode > *inode, pgoff_t index, > } > > alloc_huge: > + gfp = shmem_hugepage_gfpmask_fixup(gfp, sgp_huge); > page = shmem_alloc_and_acct_page(gfp, inode, index, true); >

Re: [PATCH] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-22 Thread Rik van Riel
On Thu, 2020-10-22 at 10:15 +0200, Michal Hocko wrote: > On Wed 21-10-20 23:48:46, Rik van Riel wrote: > > The allocation flags of anonymous transparent huge pages can be > > controlled > > through the files in /sys/kernel/mm/transparent_hugepage/defrag, > > which c

[PATCH] mm,thp,shmem: limit shmem THP alloc gfp_mask

2020-10-21 Thread Rik van Riel
hugepage allocations, to prevent that from happening. This way a THP defrag setting of "never" or "defer+madvise" will result in quick allocation failures without direct reclaim when no 2MB free pages are available. Signed-off-by: Rik van Riel --- diff --git a/include/linux/g

Re: [PATCH 0/2] mm,swap: skip swap readahead for instant IO (like zswap)

2020-10-09 Thread Rik van Riel
On Mon, 2020-10-05 at 13:32 -0400, Rik van Riel wrote: > On Tue, 2020-09-22 at 10:12 -0700, Andrew Morton wrote: > > On Mon, 21 Sep 2020 22:01:46 -0400 Rik van Riel > > wrote: > > Any quantitative testing results? > > I have test results with a real workload n

Re: [PATCH 0/2] mm,swap: skip swap readahead for instant IO (like zswap)

2020-10-05 Thread Rik van Riel
On Tue, 2020-09-22 at 10:12 -0700, Andrew Morton wrote: > On Mon, 21 Sep 2020 22:01:46 -0400 Rik van Riel > wrote: > > > Both with frontswap/zswap, and with some extremely fast IO devices, > > swap IO will be done before the "asynchronous" swap_r

Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node

2020-10-02 Thread Rik van Riel
On Fri, 2020-10-02 at 09:03 +0200, Michal Hocko wrote: > On Thu 01-10-20 18:18:10, Sebastiaan Meijer wrote: > > (Apologies for messing up the mailing list thread, Gmail had fooled > > me into > > believing that it properly picked up the thread) > > > > On Thu, 1 Oct 2020 at 14:30, Michal Hocko

Re: [PATCH 2/2] mm,swap: skip swap readahead if page was obtained instantaneously

2020-09-22 Thread Rik van Riel
On Tue, 2020-09-22 at 11:13 +0800, huang ying wrote: > On Tue, Sep 22, 2020 at 10:02 AM Rik van Riel > wrote: > > Check whether a swap page was obtained instantaneously, for example > > because it is in zswap, or on a very fast IO device which uses busy > > waiting, a

[PATCH 2/2] mm,swap: skip swap readahead if page was obtained instantaneously

2020-09-21 Thread Rik van Riel
is likely to be counterproductive, because the extra loads will cause additional latency, use up extra memory, and chances are the surrounding pages in swap are just as fast to load as this one, making readahead pointless. Signed-off-by: Rik van Riel --- mm/swap_state.c | 14 +++--- 1 file
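
[Editor's note] The idea, in sketch form — this series was not merged, and swap_readahead_single() below is a hypothetical stand-in for the helper split out in patch 1/2:

/* Read just the faulting page first. */
page = swap_readahead_single(entry, gfp_mask, vma, addr);

/*
 * If the page came back already up to date (zswap, or a polled
 * ultra-fast device), the surrounding pages would be just as fast
 * to fetch on demand -- readahead would only waste memory and add
 * latency, so skip it.
 */
if (page && PageUptodate(page))
	return page;

/* ... otherwise fall through to the normal readahead path ... */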

[PATCH 1/2] mm,swap: extract swap single page readahead into its own function

2020-09-21 Thread Rik van Riel
Split swap single page readahead into its own function, to make the next patch easier to read. No functional changes. Signed-off-by: Rik van Riel --- mm/swap_state.c | 40 +--- 1 file changed, 25 insertions(+), 15 deletions(-) diff --git a/mm/swap_state.c b

[PATCH 0/2] mm,swap: skip swap readahead for instant IO (like zswap)

2020-09-21 Thread Rik van Riel
Both with frontswap/zswap, and with some extremely fast IO devices, swap IO will be done before the "asynchronous" swap_readpage() call has returned. In that case, doing swap readahead only wastes memory, increases latency, and increases the chances of needing to evict something more useful from

Re: [PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-16 Thread Rik van Riel
On Wed, 2020-09-16 at 15:18 -0400, Nick Terrell wrote: > The zstd version in the kernel works fine. But, you can see that the > version > that got imported stagnated where upstream had 14 released versions. > I > don't think it makes sense to have kernel developers maintain their > own copy > of

[PATCH] silence nfscache allocation warnings with kvzalloc

2020-09-14 Thread Rik van Riel
enough systems. Switching to kvzalloc gets rid of the allocation warnings, and makes the code a little cleaner too as a side effect. Freeing of nn->drc_hashtbl is already done using kvfree currently. Signed-off-by: Rik van Riel --- diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c in
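
[Editor's note] The conversion itself is small; a sketch of the allocation site in fs/nfsd/nfscache.c, with surrounding context abbreviated:

/* The hash table can exceed the page-order warning threshold on
 * large-memory systems; kvzalloc() transparently falls back to
 * vmalloc instead of warning, and the existing kvfree() on the
 * teardown side already handles either kind of allocation. */
nn->drc_hashtbl = kvzalloc(array_size(hashsize,
				      sizeof(*nn->drc_hashtbl)),
			   GFP_KERNEL);
if (!nn->drc_hashtbl)
	return -ENOMEM;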

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-10 Thread Rik van Riel
On Thu, 2020-09-10 at 09:32 +0200, Michal Hocko wrote: > [Cc Vlastimil and Mel - the whole email thread starts > http://lkml.kernel.org/r/20200902180628.4052244-1-zi@sent.com > but this particular subthread has diverged a bit and you might find > it > interesting] > > On Wed 09-09-20

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-09 Thread Rik van Riel
On Wed, 2020-09-09 at 15:43 +0200, David Hildenbrand wrote: > On 09.09.20 15:19, Rik van Riel wrote: > > On Wed, 2020-09-09 at 09:04 +0200, Michal Hocko wrote: > > > > > That CMA has to be pre-reserved, right? That requires a > > > configuration. > > &g

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-09 Thread Rik van Riel
On Wed, 2020-09-09 at 09:04 +0200, Michal Hocko wrote: > On Tue 08-09-20 10:41:10, Rik van Riel wrote: > > On Tue, 2020-09-08 at 16:35 +0200, Michal Hocko wrote: > > > > > A global knob is insufficient. 1G pages will become a very > > > precious > > >

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-08 Thread Rik van Riel
On Tue, 2020-09-08 at 16:35 +0200, Michal Hocko wrote: > A global knob is insufficient. 1G pages will become a very precious > resource as it requires a pre-allocation (reservation). So it really > has > to be an opt-in and the question is whether there is also some sort > of > access control

Re: CFS flat runqueue proposal fixes/update

2020-08-20 Thread Rik van Riel
On Thu, 2020-08-20 at 16:56 +0200, Dietmar Eggemann wrote: > Hi Rik, > > On 31/07/2020 09:42, Rik van Riel wrote: > > [...] > > > Lets revisit the hierarchy from above, and assign priorities > > to the cgroups, with the fixed point one being 1000. Lets > > sa

Re: [PATCH for 5.9 v2 1/4] futex: introduce FUTEX_SWAP operation

2020-08-11 Thread Rik van Riel
On Tue, 2020-08-04 at 14:31 +0200, pet...@infradead.org wrote: > On Mon, Aug 03, 2020 at 03:15:07PM -0700, Peter Oskolkov wrote: > > A simplified/idealized use case: imagine a multi-user service > > application > > (e.g. a DBMS) that has to implement the following user CPU quota > > policy: > >

Re: CFS flat runqueue proposal fixes/update

2020-08-07 Thread Rik van Riel
On Fri, 2020-08-07 at 16:14 +0200, Dietmar Eggemann wrote: > On 31/07/2020 09:42, Rik van Riel wrote: > > Possible solution > > ... > I imagine that I can see what you want to achieve here ;-) > > But it's hard since your v5 RFC > https://lkml.kernel.org

CFS flat runqueue proposal fixes/update

2020-07-31 Thread Rik van Riel
Hello, last year at Linux Plumbers conference, I presented on my work of turning the hierarchical CFS runqueue into a flat runqueue, and Paul Turner pointed out some corner cases that could not work with my design as it was last year. Paul pointed out two corner cases, and I have come up with a

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-06-30 Thread Rik van Riel
On Tue, 2020-06-30 at 17:27 +0300, Mathias Nyman wrote: > On 30.6.2020 16.08, Rik van Riel wrote: > > I misread the code, it's not a bitfield, so state 1 means an > > endpoint marked with running state. The next urb is never getting a > > response, though. > > >

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-06-29 Thread Rik van Riel
On Mon, 2020-06-29 at 23:21 -0400, Rik van Riel wrote: > > Could you add the code below and take new traces, it will show the > > endpoint > > state after the Babble error. > > Hi Mathias, > > I have finally rebooted into a kernel with your tracepoint.

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-06-29 Thread Rik van Riel
[keeping old context since it's been a month...] On Mon, 2020-05-25 at 12:37 +0300, Mathias Nyman wrote: > On 21.5.2020 6.45, Rik van Riel wrote: > > On Wed, 2020-05-20 at 16:34 -0400, Alan Stern wrote: > > > On Wed, May 20, 2020 at 03:21:44PM -0400, Rik van Riel wrote: &

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-05-20 Thread Rik van Riel
On Wed, 2020-05-20 at 16:34 -0400, Alan Stern wrote: > On Wed, May 20, 2020 at 03:21:44PM -0400, Rik van Riel wrote: > > > > Interesting. That makes me really curious why things are > > getting stuck, now... > > This could be a bug in xhci-hcd. Perhaps the controlle

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-05-20 Thread Rik van Riel
On Wed, 2020-05-20 at 16:50 +0300, Mathias Nyman wrote: > On 20.5.2020 14.26, Rik van Riel wrote: > > After a few more weeks of digging, I have come to the tentative > > conclusion that either the XHCI driver, or the USB sound driver, > > or both, fail to handle USB errors

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-05-20 Thread Rik van Riel
On Wed, 2020-05-20 at 12:38 -0400, Alan Stern wrote: > On Wed, May 20, 2020 at 07:26:57AM -0400, Rik van Riel wrote: > > After a few more weeks of digging, I have come to the tentative > > conclusion that either the XHCI driver, or the USB sound driver, > > or both, fail

XHCI vs PCM2903B/PCM2904 part 2

2020-05-20 Thread Rik van Riel
After a few more weeks of digging, I have come to the tentative conclusion that either the XHCI driver, or the USB sound driver, or both, fail to handle USB errors correctly. I have some questions at the bottom, after a (brief-ish) explanation of exactly what seems to go wrong. TL;DR: arecord

Re: Possibility of conflicting memory types in lazier TLB mode?

2020-05-15 Thread Rik van Riel
On Fri, 2020-05-15 at 16:50 +1000, Nicholas Piggin wrote: > > But what about if there are (real, not speculative) stores in the > store > queue still on the lazy thread from when it was switched, that have > not > yet become coherent? The page is freed by another CPU and reallocated > for

Re: [PATCH] mm,thp: recheck each page before collapsing file THP

2019-10-18 Thread Rik van Riel
On Fri, 2019-10-18 at 16:34 +0300, Kirill A. Shutemov wrote: > On Thu, Oct 17, 2019 at 10:08:32PM -0700, Song Liu wrote: > > In collapse_file(), after locking the page, it is necessary to > > recheck > > that the page is up-to-date, clean, and pointing to the proper > > mapping. > > If any check

Re: [PATCH] mm,thp: recheck each page before collapsing file THP

2019-10-18 Thread Rik van Riel
;mm,thp: add read-only THP support for (non- > shmem) FS") > Cc: Kirill A. Shutemov > Cc: Johannes Weiner > Cc: Hugh Dickins > Cc: William Kucharski > Cc: Andrew Morton > Signed-off-by: Song Liu Acked-by: Rik van Riel
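
[Editor's note] The rechecks described boil down to something like the following in collapse_file(), after each page is locked (a sketch; the merged fix orders the checks slightly differently):

/* The page could have been truncated, dirtied, or migrated while
 * the page lock was dropped; re-verify before collapsing it. */
if (unlikely(!PageUptodate(page)) ||
    unlikely(PageDirty(page)) ||
    unlikely(page_mapping(page) != mapping)) {
	result = SCAN_FAIL;
	goto out_unlock;
}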

Re: [PATCH v3 09/10] sched/fair: use load instead of runnable load in wakeup path

2019-10-07 Thread Rik van Riel
On Mon, 2019-10-07 at 17:27 +0200, Vincent Guittot wrote: > On Mon, 7 Oct 2019 at 17:14, Rik van Riel wrote: > > On Thu, 2019-09-19 at 09:33 +0200, Vincent Guittot wrote: > > > runnable load has been introduced to take into account the case > > > where > > >

Re: [PATCH v3 09/10] sched/fair: use load instead of runnable load in wakeup path

2019-10-07 Thread Rik van Riel
On Thu, 2019-09-19 at 09:33 +0200, Vincent Guittot wrote: > runnable load has been introduced to take into account the case where > blocked load biases the wake up path which may end to select an > overloaded > CPU with a large number of runnable tasks instead of an underutilized > CPU with a huge

Re: [PATCH] mm/rmap.c: reuse mergeable anon_vma as parent when fork

2019-10-04 Thread Rik van Riel
e mergeable anon_vmas, we can just reuse it and > not > necessary to go through the logic. > > After this change, kernel build test reduces 20% anon_vma allocation. > > Signed-off-by: Wei Yang Acked-by: Rik van Riel -- All Rights Reversed.

Re: [PATCH v3 04/10] sched/fair: rework load_balance

2019-09-29 Thread Rik van Riel
On Thu, 2019-09-19 at 09:33 +0200, Vincent Guittot wrote: > > Also the load balance decisions have been consolidated in the 3 > functions > below after removing the few bypasses and hacks of the current code: > - update_sd_pick_busiest() select the busiest sched_group. > - find_busiest_group()

Re: [PATCH v3 03/10] sched/fair: remove meaningless imbalance calculation

2019-09-27 Thread Rik van Riel
On Thu, 2019-09-19 at 09:33 +0200, Vincent Guittot wrote: > clean up load_balance and remove meaningless calculation and fields > before > adding new algorithm. > > Signed-off-by: Vincent Guittot Yay. Acked-by: Rik van Riel -- All Rights Reversed.

Re: [PATCH v3 02/10] sched/fair: rename sum_nr_running to sum_h_nr_running

2019-09-27 Thread Rik van Riel
t; Signed-off-by: Vincent Guittot > Acked-by: Rik van Riel -- All Rights Reversed.

Re: [PATCH v3 01/10] sched/fair: clean up asym packing

2019-09-27 Thread Rik van Riel
don't need to test twice same conditions anymore to detect asym > packing > and we consolidate the calculation of imbalance in > calculate_imbalance(). > > There is no functional changes. > > Signed-off-by: Vincent Guittot Acked-by: Rik van Riel -- All Right

[PATCH 06/15] sched,cfs: use explicit cfs_rq of parent se helper

2019-09-06 Thread Rik van Riel
Use an explicit "cfs_rq of parent sched_entity" helper in a few strategic places, where cfs_rq_of(se) may no longer point at the right runqueue once we flatten the hierarchical cgroup runqueues. No functional change. Signed-off-by: Rik van Riel --- kernel/sched/fair.c | 17 ++
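
[Editor's note] A sketch of what such a helper looks like — this series was not merged, and the body below is illustrative of the "cfs_rq of the parent sched_entity" lookup the patch names:

static inline struct cfs_rq *group_cfs_rq_of_parent(struct sched_entity *se)
{
	if (se->parent)
		return group_cfs_rq(se->parent);

	/* Top-level entities live on the CPU's root cfs_rq. */
	return &rq_of(cfs_rq_of(se))->cfs;
}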

[PATCH 08/15] sched,fair: refactor enqueue/dequeue_entity

2019-09-06 Thread Rik van Riel
Refactor enqueue_entity, dequeue_entity, and update_load_avg, in order to split out the things we still want to happen at every level in the cgroup hierarchy with a flat runqueue from the things we only need to happen once. No functional changes. Signed-off-by: Rik van Riel --- kernel/sched

[PATCH 05/15] sched,fair: remove cfs_rqs from leaf_cfs_rq_list bottom up

2019-09-06 Thread Rik van Riel
nce it no longer has children on the list, we can avoid walking the sched_entity hierarchy if the bottom cfs_rq is on the list, once the runqueues have been flattened. Signed-off-by: Rik van Riel Suggested-by: Vincent Guittot Acked-by: Vincent Guittot --- kernel/sched/fair.c |

[PATCH 01/15] sched: introduce task_se_h_load helper

2019-09-06 Thread Rik van Riel
Sometimes the hierarchical load of a sched_entity needs to be calculated. Rename task_h_load to task_se_h_load, and directly pass a sched_entity to that function. Move the function declaration up above where it will be used later. No functional changes. Signed-off-by: Rik van Riel Reviewed
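
[Editor's note] The rename is mechanical; a sketch based on the existing task_h_load(), which the patch generalizes from "a task's load" to "any sched_entity's hierarchical load":

static unsigned long task_se_h_load(struct sched_entity *se)
{
	struct cfs_rq *cfs_rq = cfs_rq_of(se);

	update_cfs_rq_h_load(cfs_rq);
	return div64_ul(se->avg.load_avg * cfs_rq->h_load,
			cfs_rq_load_avg(cfs_rq) + 1);
}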
