files support PMD-mapped THP,
but neither has to do writeback. And it seems DAX doesn't have
writeback either; it uses __set_page_dirty_no_writeback() for
set_page_dirty. So this code should never be called, IIUC.
But anyway your fix looks correct to me. Reviewed-by: Yang Shi
>
>
On Tue, Apr 13, 2021 at 8:00 PM Huang, Ying wrote:
>
> Yang Shi writes:
>
> > The generic migration path will check the refcount, so there is no need to
> > check it here.
> > But the old code actually prevents migrating shared THP (mapped by
> > multiple
On Tue, Apr 13, 2021 at 7:44 PM Huang, Ying wrote:
>
> Yang Shi writes:
>
> > When the THP NUMA fault support was added THP migration was not supported
> > yet.
> > So the ad hoc THP migration was implemented in NUMA fault handling. Since
> > v4.14
faults on S390.
Signed-off-by: Yang Shi
---
mm/huge_memory.c | 4
1 file changed, 4 insertions(+)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94981907fd4c..f63445f3a17d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1741,6 +1741,7 @@ bool move_huge_pmd(struct
reworked a lot,
it seems the anon_vma lock is not required anymore to avoid the race.
The page refcount elevation while holding the ptl should prevent THP
split.
Use migrate_misplaced_page() for both base page and THP NUMA hinting
fault and remove all the dead and duplicate code.
Signed-off-by: Yang Shi
Now both base page and THP NUMA migration are done via migrate_misplaced_page();
keep the counters correct for THP.
Signed-off-by: Yang Shi
---
mm/migrate.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 333448aa53f1..a473f25fbd01
The old behavior didn't split the THP if migration failed due to lack of
memory on the target node. But the THP migration code does split the THP, so keep
the old behavior for misplaced NUMA page migration.
Signed-off-by: Yang Shi
---
mm/migrate.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion
The generic migration path will check the refcount, so there is no need to check it here.
But the old code actually prevents migrating shared THP (mapped by multiple
processes), so bail out early if the mapcount is > 1 to keep that behavior.
Signed-off-by: Yang Shi
---
mm/migrate.c |
#3 is the real meat.
Patches #4 ~ #6 keep the counters and behaviors consistent with before.
Patch #7 skips changing the huge PMD to prot_none if THP migration is not supported.
Yang Shi (7):
mm: memory: add orig_pmd to struct vm_fault
mm: memory: make numa_migrate_prep() non-static
mm:
numa_migrate_prep() will be used by the huge NUMA fault handling as well in the
following patch; make it non-static.
Signed-off-by: Yang Shi
---
mm/internal.h | 3 +++
mm/memory.c | 5 ++---
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index f469f69309de
Add orig_pmd to struct vm_fault so the "orig_pmd" parameter used by the huge page
fault path can be removed, just like its PTE counterpart.
Signed-off-by: Yang Shi
---
include/linux/huge_mm.h | 9 -
include/linux/mm.h | 3 +++
mm/huge_memory.c| 9 ++---
m
On Thu, Apr 8, 2021 at 7:58 PM Huang, Ying wrote:
>
> Yang Shi writes:
>
> > On Thu, Apr 8, 2021 at 10:19 AM Shakeel Butt wrote:
> >>
> >> Hi Tim,
> >>
> >> On Mon, Apr 5, 2021 at 11:08 AM Tim Chen
> >> wrote:
> >> >
>
On Fri, Apr 9, 2021 at 8:50 AM Dave Hansen wrote:
>
> On 4/8/21 11:17 AM, Oscar Salvador wrote:
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8490,7 +8490,8 @@ static int __alloc_contig_migrate_range(struct
> > compact_control *cc,
> > cc->nr_migratepages -=
On Thu, Apr 8, 2021 at 10:06 PM Oscar Salvador wrote:
>
> On Thu, Apr 08, 2021 at 01:40:33PM -0700, Yang Shi wrote:
> > Thanks a lot for the example code. You didn't miss anything. At first
> > glance, I thought your suggestion seemed neater. Actually I
> > misunderst
On Thu, Apr 8, 2021 at 1:29 PM Shakeel Butt wrote:
>
> On Thu, Apr 8, 2021 at 11:01 AM Yang Shi wrote:
> >
> > On Thu, Apr 8, 2021 at 10:19 AM Shakeel Butt wrote:
> > >
> > > Hi Tim,
> > >
> > > On Mon, Apr 5, 2021 at 11:08 AM Tim Chen
On Thu, Apr 8, 2021 at 11:17 AM Oscar Salvador wrote:
>
> On Thu, Apr 08, 2021 at 10:26:54AM -0700, Yang Shi wrote:
>
> > Thanks, Oscar. Yes, kind of. But we have to remember to initialize
> > "nr_succedded" pointer properly for every migrate_pages() callsite,
On Thu, Apr 8, 2021 at 10:19 AM Shakeel Butt wrote:
>
> Hi Tim,
>
> On Mon, Apr 5, 2021 at 11:08 AM Tim Chen wrote:
> >
> > Traditionally, all memory is DRAM. Some DRAM might be closer/faster than
> > others NUMA wise, but a byte of media has about the same cost whether it
> > is close or far.
On Thu, Apr 8, 2021 at 3:14 AM Oscar Salvador wrote:
>
> On Thu, Apr 01, 2021 at 11:32:23AM -0700, Dave Hansen wrote:
> >
> > From: Yang Shi
> >
> > The migrate_pages() returns the number of pages that were not migrated,
> > or an error code. When retu
On Tue, Apr 6, 2021 at 8:06 PM wrote:
>
> From: Yanfei Xu
>
> We could check MMF_DISABLE_THP ahead of iterating over all of the VMAs.
> Otherwise, if some mm_struct contains a large number of VMAs, a meaningless
> amount of CPU cycles will be wasted.
Reviewed-by: Yang Shi
>
>
On Tue, Apr 6, 2021 at 8:06 PM wrote:
>
> From: Yanfei Xu
>
> We could use a macro to deal with the addresses which need to be aligned,
> to improve the readability of the code.
Reviewed-by: Yang Shi
>
> Signed-off-by: Yanfei Xu
> ---
> mm/khugepaged.c | 27 ++
On Wed, Apr 7, 2021 at 1:32 AM Mel Gorman wrote:
>
> On Tue, Apr 06, 2021 at 09:42:07AM -0700, Yang Shi wrote:
> > On Tue, Apr 6, 2021 at 5:03 AM Gerald Schaefer
> > wrote:
> > >
> > > On Thu, 1 Apr 2021 13:10:49 -0700
> > > Yang Shi wrote:
On Tue, Apr 6, 2021 at 3:05 AM Bharata B Rao wrote:
>
> On Mon, Apr 05, 2021 at 11:08:26AM -0700, Yang Shi wrote:
> > On Sun, Apr 4, 2021 at 10:49 PM Bharata B Rao wrote:
> > >
> > > Hi,
> > >
> > > When running 1 (more-or-less-empty-)contain
On Mon, Apr 5, 2021 at 8:05 PM Xu, Yanfei wrote:
>
>
>
> On 4/6/21 10:51 AM, Xu, Yanfei wrote:
> >
> >
> > On 4/6/21 2:20 AM, Yang Shi wrote:
> >> [Please note: This e-mail is from an EXTERNAL e-mail address]
> >>
> >> On Sun, A
On Tue, Apr 6, 2021 at 5:03 AM Gerald Schaefer
wrote:
>
> On Thu, 1 Apr 2021 13:10:49 -0700
> Yang Shi wrote:
>
> [...]
> > > >
> > > > Yes, it could be. The old behavior of migration was to return -ENOMEM
> > > > if THP migra
On Sun, Apr 4, 2021 at 8:33 AM wrote:
>
> From: Yanfei Xu
>
> We could check MMF_DISABLE_THP ahead of iterating over all of the VMAs.
> Otherwise, if some mm_struct contains a large number of VMAs, a meaningless
> amount of CPU cycles will be wasted.
>
> BTW, drop an unnecessary cond_resched(), because
On Sun, Apr 4, 2021 at 10:49 PM Bharata B Rao wrote:
>
> Hi,
>
> When running 1 (more-or-less-empty-)containers on a bare-metal Power9
> server(160 CPUs, 2 NUMA nodes, 256G memory), it is seen that memory
> consumption increases quite a lot (around 172G) when the containers are
> running.
On Wed, Mar 31, 2021 at 8:17 AM Johannes Weiner wrote:
>
> On Tue, Mar 30, 2021 at 03:05:42PM -0700, Roman Gushchin wrote:
> > On Tue, Mar 30, 2021 at 05:30:10PM -0400, Johannes Weiner wrote:
> > > On Tue, Mar 30, 2021 at 11:58:31AM -0700, Roman Gushchin wrote:
> > > > On Tue, Mar 30, 2021 at
On Wed, Mar 31, 2021 at 6:20 AM Mel Gorman wrote:
>
> On Tue, Mar 30, 2021 at 04:42:00PM +0200, Gerald Schaefer wrote:
> > Could there be a work-around by splitting THP pages instead of marking them
> > as migrate pmds (via pte swap entries), at least when THP migration is not
> > supported? I
On Wed, Mar 31, 2021 at 4:47 AM Gerald Schaefer
wrote:
>
> On Tue, 30 Mar 2021 09:51:46 -0700
> Yang Shi wrote:
>
> > On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer
> > wrote:
> > >
> > > On Mon, 29 Mar 2021 11:33:06 -0700
> > > Yang Shi w
led page demotion may move data to a NUMA node
> that does not fall into the cpuset of the allocating process.
> This could be construed to violate the guarantees of cpusets.
> However, since this is an opt-in mechanism, the assumption is
> that anyone enabling it is content to relax the
On Thu, Apr 1, 2021 at 11:35 AM Dave Hansen wrote:
>
>
> From: Dave Hansen
>
> This is mostly derived from a patch from Yang Shi:
>
>
> https://lore.kernel.org/linux-mm/1560468577-101178-10-git-send-email-yang@linux.alibaba.com/
>
> Add code to the
On Wed, Mar 31, 2021 at 2:13 PM Hugh Dickins wrote:
>
> On Wed, 31 Mar 2021, Yang Shi wrote:
> > On Wed, Mar 31, 2021 at 6:54 AM Shakeel Butt wrote:
> > > On Tue, Mar 30, 2021 at 4:44 PM Hugh Dickins wrote:
> > > >
> > > > Lockdep warns mm/vmscan.c:
15.GB28839@xsang-OptiPlex-9020
> > Reported-by: kernel test robot
> > Signed-off-by: Hugh Dickins
> > Cc: Yang Shi
> > ---
> > Sorry, I've made no attempt to work out precisely where in the series
> > the locking went missing, nor tried to fit this in as a fix
Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_SPLIT
has not been used anymore. Remove the dead code.
Reviewed-by: John Hubbard
Signed-off-by: Yang Shi
---
v2:
On Mon, Mar 29, 2021 at 5:41 PM Huang, Ying wrote:
>
> Yang Shi writes:
>
> > When the THP NUMA fault support was added THP migration was not supported
> > yet.
> > So the ad hoc THP migration was implemented in NUMA fault handling. Since
> > v4.14
On Mon, Mar 29, 2021 at 5:21 PM Huang, Ying wrote:
>
> Yang Shi writes:
>
> > In the following patch the migrate_misplaced_page() will be used to migrate
> > THP
> > for NUMA fault too. Prepare to deal with THP.
> >
> > Signed-off-by: Yang Shi
On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer
wrote:
>
> On Mon, 29 Mar 2021 11:33:11 -0700
> Yang Shi wrote:
>
> > The old behavior didn't split THP if migration is failed due to lack of
> > memory on the target node. But the THP migration does split THP, so k
On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer
wrote:
>
> On Mon, 29 Mar 2021 11:33:06 -0700
> Yang Shi wrote:
>
> >
> > When the THP NUMA fault support was added THP migration was not supported
> > yet.
> > So the ad hoc THP migration was implemented in NU
On Tue, Mar 30, 2021 at 12:08 AM John Hubbard wrote:
>
> On 3/29/21 12:38 PM, Yang Shi wrote:
> > Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of
> > FOLL_SPLIT")
> > and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_
Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_SPLIT
has not been used anymore. Remove the dead code.
Signed-off-by: Yang Shi
---
include/linux/mm.h | 1 -
mm/
The old behavior didn't split the THP if migration failed due to lack of
memory on the target node. But the THP migration code does split the THP, so keep
the old behavior for misplaced NUMA page migration.
Signed-off-by: Yang Shi
---
mm/migrate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion
is not required anymore to avoid the race.
The page refcount elevation while holding the ptl should prevent THP
split.
Signed-off-by: Yang Shi
---
include/linux/migrate.h | 23 --
mm/huge_memory.c| 132 --
mm/migrate.c| 173
We don't have to keep the redundant page count check for THP anymore after
switching to the generic migration code.
Signed-off-by: Yang Shi
---
mm/migrate.c | 12
1 file changed, 12 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 1c0c873375ab..328f76848d6c 100644
--- a/mm
numa_migrate_prep() will be used by the huge NUMA fault handling as well in the
following patch; make it non-static.
Signed-off-by: Yang Shi
---
mm/internal.h | 3 +++
mm/memory.c | 5 ++---
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 1432feec62df
saw there were some hacks about gup in the git history, but I didn't figure out
whether they have been removed or not, since I just found FOLL_NUMA code in the
current gup implementation and it seems useful.
Yang Shi (6):
mm: memory: add orig_pmd to struct vm_fault
mm: memory: make
Add orig_pmd to struct vm_fault so the "orig_pmd" parameter used by the huge page
fault path can be removed, just like its PTE counterpart.
Signed-off-by: Yang Shi
---
include/linux/huge_mm.h | 9 -
include/linux/mm.h | 1 +
mm/huge_memory.c| 9 ++---
m
In the following patch migrate_misplaced_page() will be used to migrate THP
for NUMA fault too. Prepare to deal with THP.
Signed-off-by: Yang Shi
---
include/linux/migrate.h | 6 --
mm/memory.c | 2 +-
mm/migrate.c| 2 +-
3 files changed, 6 insertions(+), 4
and remove all three VM_BUG_ON_PAGE() calls.
Reviewed-by: Yang Shi
>
> Signed-off-by: Miaohe Lin
> ---
> include/linux/migrate.h | 1 -
> mm/migrate.c| 7 +--
> 2 files changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/include/linux/migrate.h b/inclu
On Tue, Mar 23, 2021 at 10:17 AM Yang Shi wrote:
>
> On Tue, Mar 23, 2021 at 6:55 AM Miaohe Lin wrote:
> >
> > Since commit c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA
> > balancing"), the NUMA balancing would skip shared exec transh
On Tue, Mar 23, 2021 at 6:54 AM Miaohe Lin wrote:
>
> The !PageLocked() check is implicitly done in PageMovable(). Remove this
> explicit one.
TBH, I'm a little bit reluctant to have this kind of change. If the "locked"
check is necessary, we'd better make it explicit; otherwise just remove
it.
And why
On Tue, Mar 23, 2021 at 6:54 AM Miaohe Lin wrote:
>
> It's guaranteed that in the 'else' case of the rc == MIGRATEPAGE_SUCCESS
> check, rc does not equal MIGRATEPAGE_SUCCESS. Remove this unnecessary
> check.
Reviewed-by: Yang Shi
>
> Reviewed-by: David Hildenbrand
>
the first
place.
Your fix is correct, and please add the above justification to your commit log.
Reviewed-by: Yang Shi
>
> Fixes: c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA balancing")
> Signed-off-by: Miaohe Lin
> ---
> mm/migrate.c | 4
> 1 file cha
On Sun, Mar 21, 2021 at 7:11 PM Zi Yan wrote:
>
> On 19 Mar 2021, at 19:37, Yang Shi wrote:
>
> > On Thu, Mar 18, 2021 at 5:52 PM Zi Yan wrote:
> >>
> >> From: Zi Yan
> >>
> >> We did not have a direct user interface of splitting the compound
put_page(fpage);
> + }
> +
> + filp_close(candidate, NULL);
> + ret = 0;
> +
> + pr_info("%lu of %lu file-backed THP split\n", split, total);
> +out:
> + putname(file);
> + return ret;
> +}
> +
> +#define MAX_INPUT
selftests/vm to utilize the interface by splitting
> PMD THPs and PTE-mapped THPs.
>
> This does not change the old behavior, i.e., writing 1 to the interface
> to split all THPs in the system.
>
> Changelog:
>
> From v5:
> 1. Skipped special VMAs and other fixes. (sugge
On Wed, Mar 17, 2021 at 8:00 AM Zi Yan wrote:
>
> On 16 Mar 2021, at 19:18, Yang Shi wrote:
>
> > On Mon, Mar 15, 2021 at 1:34 PM Zi Yan wrote:
> >>
> >> From: Zi Yan
> >>
> >> Further extend /split_huge_pages to accept
> >> ",
On Mon, Mar 15, 2021 at 1:34 PM Zi Yan wrote:
>
> From: Zi Yan
>
> Further extend /split_huge_pages to accept
> ",," for file-backed THP split tests since
> tmpfs may have file backed by THP that mapped nowhere.
>
> Update selftest program to test file-backed THP split too.
>
> Suggested-by:
given pid code to a separate
>function.
> 2. Added the missing put_page for not split pages.
> 3. pr_debug -> pr_info, make reading results simpler.
>
> From v2:
>
> 1. Reused existing /split_huge_pages interface. (suggested by
>Yang Shi)
>
> From v1:
On Mon, Mar 15, 2021 at 12:15 PM Roman Gushchin wrote:
>
>
> On Mon, Mar 15, 2021 at 07:49:57PM +0100, Vlastimil Babka wrote:
> > On 3/9/21 4:25 PM, Xunlei Pang wrote:
> > > count_partial() can hold n->list_lock spinlock for quite long, which
> > > makes much trouble to the system. This series
On Mon, Mar 15, 2021 at 11:37 AM Zi Yan wrote:
>
> On 15 Mar 2021, at 8:07, Kirill A. Shutemov wrote:
>
> > On Thu, Mar 11, 2021 at 07:57:12PM -0500, Zi Yan wrote:
> >> From: Zi Yan
> >>
> >> We do not have a direct user interface of splitting the compound page
> >> backing a THP
> >
> > But we
On Fri, Mar 12, 2021 at 11:57 PM Yu Zhao wrote:
>
> TLDR
>
> The current page reclaim is too expensive in terms of CPU usage and
> often making poor choices about what to evict. We would like to offer
> a performant, versatile and straightforward augment.
>
> Repo
>
> git fetch
On Thu, Mar 11, 2021 at 7:52 AM Zi Yan wrote:
>
> On 10 Mar 2021, at 20:12, Yang Shi wrote:
>
> > On Wed, Mar 10, 2021 at 7:36 AM Zi Yan wrote:
> >>
> >> From: Zi Yan
> >>
> >> We do not have a direct user interface of splitting the compound
patch:
https://lore.kernel.org/linux-xfs/20191031234618.15403-13-da...@fromorbit.com/
Tested with a kernel build and a vfs-metadata-heavy workload in our production
environment; no regression has been spotted so far.
Signed-off-by: Yang Shi
---
mm/vmscan.c | 46
Now that the shrinker's nr_deferred is per-memcg for memcg-aware shrinkers, add it
to the parent's corresponding nr_deferred when the memcg is offlined.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 1
r's SHRINKER_MEMCG_AWARE flag would be
cleared.
This makes the implementation of this patch simpler.
Acked-by: Vlastimil Babka
Reviewed-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 31 ---
1 file changed, 16 inserti
When memcg is not enabled (!CONFIG_MEMCG or memcg disabled), the shrinker's
nr_deferred is used. And non-memcg-aware shrinkers use the shrinker's
nr_deferred all the time.
Acked-by: Roman Gushchin
Acked-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memco
Signed-off-by: Yang Shi
---
mm/vmscan.c | 78 -
1 file changed, 66 insertions(+), 12 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 397f3b67bad8..5bc6975cb635 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -376,6 +376,24 @@ static void
Extract the dereference into a helper to make the code more readable. No
functional change.
Acked-by: Roman Gushchin
Acked-by: Kirill Tkhai
Acked-by: Vlastimil Babka
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 14 ++
1 file changed, 10 insertions(+), 4 deletions(-)
would prevent the shrinkers
from unregistering correctly.
Remove SHRINKER_REGISTERING since we could check if shrinker is registered
successfully by the new flag.
Acked-by: Kirill Tkhai
Acked-by: Vlastimil Babka
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include
he "memcg_" prefix.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 8 +++---
mm/memcontrol.c| 6 ++--
mm/vmscan.c| 58 +++
Use kvfree_rcu() to free the old shrinker_maps instead of call_rcu().
We no longer have to define a dedicated callback for call_rcu().
Acked-by: Roman Gushchin
Acked-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 7 +--
1 file changed, 1
-by: Roman Gushchin
Acked-by: Vlastimil Babka
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 20 +++-
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b08c8d9055ae..641a0b8b4ea9 100644
--- a/mm/vmscan.c
+++ b/mm
larity.
And a test with a heavy paging workload didn't show that the write lock makes things worse.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 18 --
1 file changed, 8 insertions(+), 10 del
for tighter integration with shrinker
code,
and remove the "memcg_" prefix. There is no functional change.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 11 ++--
mm/hug
mcg
needs ~320 bytes. 10K memcgs would need ~3.2MB of memory. That seems fine.
We have been running the patched kernel on some hosts of our fleet (test and
production) for months; it works very well. The monitoring data shows the working
set is sustained as expected.
Yang Shi (13):
mm: vmscan: use n
. It seems confusing. And the following patch
will stop using nid directly in do_shrink_slab(); this patch also helps clean up
the code.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
Acked-by: Roman Gushchin
Signed-off-by: Yang Shi
---
mm/vmscan.c | 2 +-
1 file
On Wed, Mar 10, 2021 at 7:36 AM Zi Yan wrote:
>
> From: Zi Yan
>
> We do not have a direct user interface of splitting the compound page
> backing a THP and there is no need unless we want to expose the THP
> implementation details to users. Adding an interface for debugging.
>
> By writing ",,"
On Wed, Mar 10, 2021 at 2:41 PM Shakeel Butt wrote:
>
> On Wed, Mar 10, 2021 at 1:41 PM Yang Shi wrote:
> >
> > On Wed, Mar 10, 2021 at 1:08 PM Shakeel Butt wrote:
> > >
> > > On Wed, Mar 10, 2021 at 10:54 AM Yang Shi wrote:
> > > >
>
On Wed, Mar 10, 2021 at 1:08 PM Shakeel Butt wrote:
>
> On Wed, Mar 10, 2021 at 10:54 AM Yang Shi wrote:
> >
> > On Wed, Mar 10, 2021 at 10:24 AM Shakeel Butt wrote:
> > >
> > > On Wed, Mar 10, 2021 at 9:46 AM Yang Shi wrote:
> > > >
On Wed, Mar 10, 2021 at 10:24 AM Shakeel Butt wrote:
>
> On Wed, Mar 10, 2021 at 9:46 AM Yang Shi wrote:
> >
> > The number of deferred objects might wind up at an absurd number, and it
> > results in clamping of slab objects. It is undesirable for sustaining
> >
patch:
https://lore.kernel.org/linux-xfs/20191031234618.15403-13-da...@fromorbit.com/
Tested with a kernel build and a vfs-metadata-heavy workload in our production
environment; no regression has been spotted so far.
Signed-off-by: Yang Shi
---
mm/vmscan.c | 46
r's SHRINKER_MEMCG_AWARE flag would be
cleared.
This makes the implementation of this patch simpler.
Acked-by: Vlastimil Babka
Reviewed-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 31 ---
1 file changed, 16 inserti
Now that the shrinker's nr_deferred is per-memcg for memcg-aware shrinkers, add it
to the parent's corresponding nr_deferred when the memcg is offlined.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 1
Signed-off-by: Yang Shi
---
mm/vmscan.c | 78 -
1 file changed, 66 insertions(+), 12 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ae82afe6cec6..326f0e0c4356 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -374,6 +374,24 @@ static void
would prevent the shrinkers
from unregistering correctly.
Remove SHRINKER_REGISTERING since we could check if shrinker is registered
successfully by the new flag.
Acked-by: Kirill Tkhai
Acked-by: Vlastimil Babka
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include
When memcg is not enabled (!CONFIG_MEMCG or memcg disabled), the shrinker's
nr_deferred is used. And non-memcg-aware shrinkers use the shrinker's
nr_deferred all the time.
Acked-by: Roman Gushchin
Acked-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memco
he "memcg_" prefix.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 8 +++---
mm/memcontrol.c| 6 ++--
mm/vmscan.c| 58 +++
Extract the dereference into a helper to make the code more readable. No
functional change.
Acked-by: Roman Gushchin
Acked-by: Kirill Tkhai
Acked-by: Vlastimil Babka
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 15 ++-
1 file changed, 10 insertions(+), 5 del
-by: Roman Gushchin
Acked-by: Vlastimil Babka
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 20 +++-
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 75fd8038a6c8..bda67e1ac84b 100644
--- a/mm/vmscan.c
+++ b/mm
Use kvfree_rcu() to free the old shrinker_maps instead of call_rcu().
We no longer have to define a dedicated callback for call_rcu().
Acked-by: Roman Gushchin
Acked-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 7 +--
1 file changed, 1
larity.
And a test with a heavy paging workload didn't show that the write lock makes things worse.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
mm/vmscan.c | 18 --
1 file changed, 8 insertions(+), 10 del
for tighter integration with shrinker
code,
and remove the "memcg_" prefix. There is no functional change.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 11 ++--
mm/hug
nd
production) for
months; it works very well. The monitoring data shows the working set is sustained
as expected.
Yang Shi (13):
mm: vmscan: use nid from shrink_control for tracepoint
mm: vmscan: consolidate shrinker_maps handling code
mm: vmscan: use shrinker_rwsem to protect
. It seems confusing. And the following patch
will stop using nid directly in do_shrink_slab(); this patch also helps clean up
the code.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
Acked-by: Roman Gushchin
Signed-off-by: Yang Shi
---
mm/vmscan.c | 2 +-
1 file
> The meat of this patch is in:
>
> [PATCH 05/10] mm/migrate: demote pages during reclaim
>
> Which also has the most changes since the last post. This version is
> mostly to address review comments from Yang Shi and Oscar Salvador.
> Review comments are documented in the individua
the guarantees.
I think we'd better have the cpuset violation paragraph along with the new
zone reclaim mode text so that users are aware of the potential
violation. I don't think the commit log is the go-to place for plain
users.
>
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> C
al is to reduce the
> total memory consumption of the entire memcg, across all
> nodes. Migration does not assist memcg reclaim because
> it just moves page contents between nodes rather than
> actually reducing memory consumption.
Reviewed-by: Yang Shi
>
> Signed-off-by: Dave
/preliminary
> which just says whether there is a possibility of future reclaim.
Reviewed-by: Yang Shi
>
> #Signed-off-by: Keith Busch
> Cc: Keith Busch
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> Cc: David Rientjes
> Cc: Huang Ying
> Cc: Dan Williams
> Cc: David Hildenbran
tal_swap_pages' checks into a helper, give it a
> logically significant name, and check for the possibility of page
> demotion.
Reviewed-by: Yang Shi
>
> Signed-off-by: Dave Hansen
> Cc: David Rientjes
> Cc: Huang Ying
> Cc: Dan Williams
> Cc: David Hildenbrand
> Cc: osalvador
On Thu, Mar 4, 2021 at 4:01 PM Dave Hansen wrote:
>
>
> From: Yang Shi
>
> Account the number of demoted pages into reclaim_state->nr_demoted.
>
> Add pgdemote_kswapd and pgdemote_direct VM counters showed in
> /proc/vmstat.
>
> [ daveh:
>- __count_v