Re: [PATCH 1/5] mm: rmap: fix cache flush on THP pages

2022-01-21 Thread Yang Shi
files support PMD-mapped THP, but both don't have to do writeback. And it seems DAX doesn't have writeback either, which uses __set_page_dirty_no_writeback() for set_page_dirty. So this code should never be called IIUC. But anyway your fix looks correct to me. Reviewed-by: Yang Shi > >

Re: [v2 PATCH 6/7] mm: migrate: check mapcount for THP instead of ref count

2021-04-14 Thread Yang Shi
On Tue, Apr 13, 2021 at 8:00 PM Huang, Ying wrote: > > Yang Shi writes: > > > The generic migration path will check refcount, so no need check refcount > > here. > > But the old code actually prevents from migrating shared THP (mapped by > > multiple

Re: [v2 PATCH 3/7] mm: thp: refactor NUMA fault handling

2021-04-14 Thread Yang Shi
On Tue, Apr 13, 2021 at 7:44 PM Huang, Ying wrote: > > Yang Shi writes: > > > When the THP NUMA fault support was added THP migration was not supported > > yet. > > So the ad hoc THP migration was implemented in NUMA fault handling. Since > > v4.14 >

[v2 PATCH 7/7] mm: thp: skip make PMD PROT_NONE if THP migration is not supported

2021-04-13 Thread Yang Shi
faults on S390. Signed-off-by: Yang Shi --- mm/huge_memory.c | 4 1 file changed, 4 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 94981907fd4c..f63445f3a17d 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1741,6 +1741,7 @@ bool move_huge_pmd(struct

[v2 PATCH 3/7] mm: thp: refactor NUMA fault handling

2021-04-13 Thread Yang Shi
reworked a lot, it seems anon_vma lock is not required anymore to avoid the race. The page refcount elevation when holding ptl should prevent from THP split. Use migrate_misplaced_page() for both base page and THP NUMA hinting fault and remove all the dead and duplicate code. Signed-off-by: Yang Shi

[v2 PATCH 4/7] mm: migrate: account THP NUMA migration counters correctly

2021-04-13 Thread Yang Shi
Now both base page and THP NUMA migration is done via migrate_misplaced_page(), keep the counters correctly for THP. Signed-off-by: Yang Shi --- mm/migrate.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 333448aa53f1..a473f25fbd01

[v2 PATCH 5/7] mm: migrate: don't split THP for misplaced NUMA page

2021-04-13 Thread Yang Shi
The old behavior didn't split THP if migration is failed due to lack of memory on the target node. But the THP migration does split THP, so keep the old behavior for misplaced NUMA page migration. Signed-off-by: Yang Shi --- mm/migrate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion

[v2 PATCH 6/7] mm: migrate: check mapcount for THP instead of ref count

2021-04-13 Thread Yang Shi
The generic migration path will check refcount, so no need check refcount here. But the old code actually prevents from migrating shared THP (mapped by multiple processes), so bail out early if mapcount is > 1 to keep the behavior. Signed-off-by: Yang Shi --- mm/migrate.c |
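As a rough illustration of the early bail-out described in this patch, the check can be modelled in plain userspace C. The struct `model_thp` and the function `model_should_migrate_thp()` are hypothetical names invented for this sketch; the actual change is in mm/migrate.c and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a THP; the kernel works on struct page. */
struct model_thp {
	int mapcount;	/* how many times the page is mapped */
};

/*
 * Userspace model only: return true if this sketch would go on to
 * attempt NUMA migration of the THP, false if it bails out early.
 */
static bool model_should_migrate_thp(const struct model_thp *thp)
{
	assert(thp != NULL);

	/* A THP mapped more than once is treated as shared: skip it. */
	if (thp->mapcount > 1)
		return false;

	return true;
}
```

The refcount is deliberately not examined here, mirroring the commit message's point that the generic migration path already checks it.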

[v2 RFC PATCH 0/7] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-13 Thread Yang Shi
#3 is the real meat. Patch #4 ~ #6 keep consistent counters and behaviors with before. Patch #7 skips change huge PMD to prot_none if thp migration is not supported. Yang Shi (7): mm: memory: add orig_pmd to struct vm_fault mm: memory: make numa_migrate_prep() non-static mm:

[v2 PATCH 2/7] mm: memory: make numa_migrate_prep() non-static

2021-04-13 Thread Yang Shi
The numa_migrate_prep() will be used by huge NUMA fault as well in the following patch, make it non-static. Signed-off-by: Yang Shi --- mm/internal.h | 3 +++ mm/memory.c | 5 ++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index f469f69309de

[v2 PATCH 1/7] mm: memory: add orig_pmd to struct vm_fault

2021-04-13 Thread Yang Shi
Add orig_pmd to struct vm_fault so the "orig_pmd" parameter used by huge page fault could be removed, just like its PTE counterpart does. Signed-off-by: Yang Shi --- include/linux/huge_mm.h | 9 - include/linux/mm.h | 3 +++ mm/huge_memory.c| 9 ++--- m

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-09 Thread Yang Shi
On Thu, Apr 8, 2021 at 7:58 PM Huang, Ying wrote: > > Yang Shi writes: > > > On Thu, Apr 8, 2021 at 10:19 AM Shakeel Butt wrote: > >> > >> Hi Tim, > >> > >> On Mon, Apr 5, 2021 at 11:08 AM Tim Chen > >> wrote: > >> > >

Re: [PATCH 04/10] mm/migrate: make migrate_pages() return nr_succeeded

2021-04-09 Thread Yang Shi
On Fri, Apr 9, 2021 at 8:50 AM Dave Hansen wrote: > > On 4/8/21 11:17 AM, Oscar Salvador wrote: > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -8490,7 +8490,8 @@ static int __alloc_contig_migrate_range(struct > > compact_control *cc, > > cc->nr_migratepages -=

Re: [PATCH 04/10] mm/migrate: make migrate_pages() return nr_succeeded

2021-04-09 Thread Yang Shi
On Thu, Apr 8, 2021 at 10:06 PM Oscar Salvador wrote: > > On Thu, Apr 08, 2021 at 01:40:33PM -0700, Yang Shi wrote: > > Thanks a lot for the example code. You didn't miss anything. At first > > glance, I thought your suggestion seemed neater. Actually I > > misunderst

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-08 Thread Yang Shi
On Thu, Apr 8, 2021 at 1:29 PM Shakeel Butt wrote: > > On Thu, Apr 8, 2021 at 11:01 AM Yang Shi wrote: > > > > On Thu, Apr 8, 2021 at 10:19 AM Shakeel Butt wrote: > > > > > > Hi Tim, > > > > > > On Mon, Apr 5, 2021 at 11:08 AM Tim Chen

Re: [PATCH 04/10] mm/migrate: make migrate_pages() return nr_succeeded

2021-04-08 Thread Yang Shi
On Thu, Apr 8, 2021 at 11:17 AM Oscar Salvador wrote: > > On Thu, Apr 08, 2021 at 10:26:54AM -0700, Yang Shi wrote: > > > Thanks, Oscar. Yes, kind of. But we have to remember to initialize > > "nr_succedded" pointer properly for every migrate_pages() callsite,

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-08 Thread Yang Shi
On Thu, Apr 8, 2021 at 10:19 AM Shakeel Butt wrote: > > Hi Tim, > > On Mon, Apr 5, 2021 at 11:08 AM Tim Chen wrote: > > > > Traditionally, all memory is DRAM. Some DRAM might be closer/faster than > > others NUMA wise, but a byte of media has about the same cost whether it > > is close or far.

Re: [PATCH 04/10] mm/migrate: make migrate_pages() return nr_succeeded

2021-04-08 Thread Yang Shi
On Thu, Apr 8, 2021 at 3:14 AM Oscar Salvador wrote: > > On Thu, Apr 01, 2021 at 11:32:23AM -0700, Dave Hansen wrote: > > > > From: Yang Shi > > > > The migrate_pages() returns the number of pages that were not migrated, > > or an error code. When retu
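The return convention quoted above (the number of pages not migrated, with successes reported separately) can be sketched in userspace C. This is only a model under stated assumptions: `model_migrate_pages()`, its `migrated` array and its `nr_succeeded` out-parameter are hypothetical, and having the callee reset the counter is a choice made for this sketch, not something the thread confirms about the kernel patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only, NOT the kernel's migrate_pages().
 * migrated[i] != 0 means hypothetical page i was migrated.
 * Returns the number of pages that were not migrated and stores the
 * number that were migrated in *nr_succeeded.
 */
static int model_migrate_pages(const int *migrated, size_t nr_pages,
			       unsigned int *nr_succeeded)
{
	int nr_failed = 0;
	size_t i;

	assert(nr_succeeded != NULL);
	*nr_succeeded = 0;	/* reset here so a stale caller value cannot leak in */

	for (i = 0; i < nr_pages; i++) {
		if (migrated[i])
			(*nr_succeeded)++;
		else
			nr_failed++;
	}

	return nr_failed;
}
```

Resetting the counter inside the callee is one way to avoid depending on every call site initializing it; the alternative, leaving initialization to callers, is the burden the follow-up discussion in this thread is concerned with.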

Re: [PATCH v2 2/2] mm: khugepaged: check MMF_DISABLE_THP ahead of iterating over vmas

2021-04-07 Thread Yang Shi
On Tue, Apr 6, 2021 at 8:06 PM wrote: > > From: Yanfei Xu > > We could check MMF_DISABLE_THP ahead of iterating over all of vma. > Otherwise if some mm_struct contain a large number of vma, there will > be amounts meaningless cpu cycles cost. Reviewed-by: Yang Shi > >

Re: [PATCH v2 1/2] mm: khugepaged: use macro to align addresses

2021-04-07 Thread Yang Shi
On Tue, Apr 6, 2021 at 8:06 PM wrote: > > From: Yanfei Xu > > We could use macro to deal with the addresses which need to be aligned > to improve readability of codes. Reviewed-by: Yang Shi > > Signed-off-by: Yanfei Xu > --- > mm/khugepaged.c | 27 ++

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-07 Thread Yang Shi
On Wed, Apr 7, 2021 at 1:32 AM Mel Gorman wrote: > > On Tue, Apr 06, 2021 at 09:42:07AM -0700, Yang Shi wrote: > > On Tue, Apr 6, 2021 at 5:03 AM Gerald Schaefer > > wrote: > > > > > > On Thu, 1 Apr 2021 13:10:49 -0700 > > > Yang Shi wrote:

Re: High kmalloc-32 slab cache consumption with 10k containers

2021-04-06 Thread Yang Shi
On Tue, Apr 6, 2021 at 3:05 AM Bharata B Rao wrote: > > On Mon, Apr 05, 2021 at 11:08:26AM -0700, Yang Shi wrote: > > On Sun, Apr 4, 2021 at 10:49 PM Bharata B Rao wrote: > > > > > > Hi, > > > > > > When running 1 (more-or-less-empty-)contain

Re: [PATCH 2/2] mm: khugepaged: check MMF_DISABLE_THP ahead of iterating over vmas

2021-04-06 Thread Yang Shi
On Mon, Apr 5, 2021 at 8:05 PM Xu, Yanfei wrote: > > > > On 4/6/21 10:51 AM, Xu, Yanfei wrote: > > > > > > On 4/6/21 2:20 AM, Yang Shi wrote: > >> [Please note: This e-mail is from an EXTERNAL e-mail address] > >> > >> On Sun, A

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-06 Thread Yang Shi
On Tue, Apr 6, 2021 at 5:03 AM Gerald Schaefer wrote: > > On Thu, 1 Apr 2021 13:10:49 -0700 > Yang Shi wrote: > > [...] > > > > > > > > Yes, it could be. The old behavior of migration was to return -ENOMEM > > > > if THP migra

Re: [PATCH 2/2] mm: khugepaged: check MMF_DISABLE_THP ahead of iterating over vmas

2021-04-05 Thread Yang Shi
On Sun, Apr 4, 2021 at 8:33 AM wrote: > > From: Yanfei Xu > > We could check MMF_DISABLE_THP ahead of iterating over all of vma. > Otherwise if some mm_struct contain a large number of vma, there will > be amounts meaningless cpu cycles cost. > > BTW, drop an unnecessary cond_resched(), because

Re: High kmalloc-32 slab cache consumption with 10k containers

2021-04-05 Thread Yang Shi
On Sun, Apr 4, 2021 at 10:49 PM Bharata B Rao wrote: > > Hi, > > When running 1 (more-or-less-empty-)containers on a bare-metal Power9 > server(160 CPUs, 2 NUMA nodes, 256G memory), it is seen that memory > consumption increases quite a lot (around 172G) when the containers are > running.

Re: [RFC PATCH 00/15] Use obj_cgroup APIs to charge the LRU pages

2021-04-01 Thread Yang Shi
On Wed, Mar 31, 2021 at 8:17 AM Johannes Weiner wrote: > > On Tue, Mar 30, 2021 at 03:05:42PM -0700, Roman Gushchin wrote: > > On Tue, Mar 30, 2021 at 05:30:10PM -0400, Johannes Weiner wrote: > > > On Tue, Mar 30, 2021 at 11:58:31AM -0700, Roman Gushchin wrote: > > > > On Tue, Mar 30, 2021 at

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-01 Thread Yang Shi
On Wed, Mar 31, 2021 at 6:20 AM Mel Gorman wrote: > > On Tue, Mar 30, 2021 at 04:42:00PM +0200, Gerald Schaefer wrote: > > Could there be a work-around by splitting THP pages instead of marking them > > as migrate pmds (via pte swap entries), at least when THP migration is not > > supported? I

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-01 Thread Yang Shi
On Wed, Mar 31, 2021 at 4:47 AM Gerald Schaefer wrote: > > On Tue, 30 Mar 2021 09:51:46 -0700 > Yang Shi wrote: > > > On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer > > wrote: > > > > > > On Mon, 29 Mar 2021 11:33:06 -0700 > > > Yang Shi w

Re: [PATCH 10/10] mm/migrate: new zone_reclaim_mode to enable reclaim migration

2021-04-01 Thread Yang Shi
led page demotion may move data to a NUMA node > that does not fall into the cpuset of the allocating process. > This could be construed to violate the guarantees of cpusets. > However, since this is an opt-in mechanism, the assumption is > that anyone enabling it is content to relax the

Re: [PATCH 05/10] mm/migrate: demote pages during reclaim

2021-04-01 Thread Yang Shi
On Thu, Apr 1, 2021 at 11:35 AM Dave Hansen wrote: > > > From: Dave Hansen > > This is mostly derived from a patch from Yang Shi: > > > https://lore.kernel.org/linux-mm/1560468577-101178-10-git-send-email-yang@linux.alibaba.com/ > > Add code to the

Re: [PATCH mmotm] mm: vmscan: fix shrinker_rwsem in free_shrinker_info()

2021-03-31 Thread Yang Shi
On Wed, Mar 31, 2021 at 2:13 PM Hugh Dickins wrote: > > On Wed, 31 Mar 2021, Yang Shi wrote: > > On Wed, Mar 31, 2021 at 6:54 AM Shakeel Butt wrote: > > > On Tue, Mar 30, 2021 at 4:44 PM Hugh Dickins wrote: > > > > > > > > Lockdep warns mm/vmscan.c:

Re: [PATCH mmotm] mm: vmscan: fix shrinker_rwsem in free_shrinker_info()

2021-03-31 Thread Yang Shi
15.GB28839@xsang-OptiPlex-9020 > > Reported-by: kernel test robot > > Signed-off-by: Hugh Dickins > > Cc: Yang Shi > > --- > > Sorry, I've made no attempt to work out precisely where in the series > > the locking went missing, nor tried to fit this in as a fix

[v2 PATCH] mm: gup: remove FOLL_SPLIT

2021-03-30 Thread Yang Shi
Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT") and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_SPLIT has not been used anymore. Remove the dead code. Reviewed-by: John Hubbard Signed-off-by: Yang Shi --- v2:

Re: [PATCH 4/6] mm: thp: refactor NUMA fault handling

2021-03-30 Thread Yang Shi
On Mon, Mar 29, 2021 at 5:41 PM Huang, Ying wrote: > > Yang Shi writes: > > > When the THP NUMA fault support was added THP migration was not supported > > yet. > > So the ad hoc THP migration was implemented in NUMA fault handling. Since > > v4.14 >

Re: [PATCH 3/6] mm: migrate: teach migrate_misplaced_page() about THP

2021-03-30 Thread Yang Shi
On Mon, Mar 29, 2021 at 5:21 PM Huang, Ying wrote: > > Yang Shi writes: > > > In the following patch the migrate_misplaced_page() will be used to migrate > > THP > > for NUMA fault too. Prepare to deal with THP. > > > > Signed-off-by: Yang Shi

Re: [PATCH 5/6] mm: migrate: don't split THP for misplaced NUMA page

2021-03-30 Thread Yang Shi
On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer wrote: > > On Mon, 29 Mar 2021 11:33:11 -0700 > Yang Shi wrote: > > > The old behavior didn't split THP if migration is failed due to lack of > > memory on the target node. But the THP migration does split THP, so k

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-30 Thread Yang Shi
On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer wrote: > > On Mon, 29 Mar 2021 11:33:06 -0700 > Yang Shi wrote: > > > > > When the THP NUMA fault support was added THP migration was not supported > > yet. > > So the ad hoc THP migration was implemented in NU

Re: [PATCH] mm: gup: remove FOLL_SPLIT

2021-03-30 Thread Yang Shi
On Tue, Mar 30, 2021 at 12:08 AM John Hubbard wrote: > > On 3/29/21 12:38 PM, Yang Shi wrote: > > Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of > > FOLL_SPLIT") > > and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_

[PATCH] mm: gup: remove FOLL_SPLIT

2021-03-29 Thread Yang Shi
Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT") and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_SPLIT has not been used anymore. Remove the dead code. Signed-off-by: Yang Shi --- include/linux/mm.h | 1 - mm/

[PATCH 5/6] mm: migrate: don't split THP for misplaced NUMA page

2021-03-29 Thread Yang Shi
The old behavior didn't split THP if migration is failed due to lack of memory on the target node. But the THP migration does split THP, so keep the old behavior for misplaced NUMA page migration. Signed-off-by: Yang Shi --- mm/migrate.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion

[PATCH 4/6] mm: thp: refactor NUMA fault handling

2021-03-29 Thread Yang Shi
is not required anymore to avoid the race. The page refcount elevation when holding ptl should prevent from THP split. Signed-off-by: Yang Shi --- include/linux/migrate.h | 23 -- mm/huge_memory.c| 132 -- mm/migrate.c| 173

[PATCH 6/6] mm: migrate: remove redundant page count check for THP

2021-03-29 Thread Yang Shi
Don't have to keep the redundant page count check for THP anymore after switching to use generic migration code. Signed-off-by: Yang Shi --- mm/migrate.c | 12 1 file changed, 12 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 1c0c873375ab..328f76848d6c 100644 --- a/mm

[PATCH 2/6] mm: memory: make numa_migrate_prep() non-static

2021-03-29 Thread Yang Shi
The numa_migrate_prep() will be used by huge NUMA fault as well in the following patch, make it non-static. Signed-off-by: Yang Shi --- mm/internal.h | 3 +++ mm/memory.c | 5 ++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 1432feec62df

[RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-29 Thread Yang Shi
saw there were some hacks about gup from git history, but I didn't figure out if they have been removed or not since I just found FOLL_NUMA code in the current gup implementation and they seems useful. Yang Shi (6): mm: memory: add orig_pmd to struct vm_fault mm: memory: make

[PATCH 1/6] mm: memory: add orig_pmd to struct vm_fault

2021-03-29 Thread Yang Shi
Add orig_pmd to struct vm_fault so the "orig_pmd" parameter used by huge page fault could be removed, just like its PTE counterpart does. Signed-off-by: Yang Shi --- include/linux/huge_mm.h | 9 - include/linux/mm.h | 1 + mm/huge_memory.c| 9 ++--- m

[PATCH 3/6] mm: migrate: teach migrate_misplaced_page() about THP

2021-03-29 Thread Yang Shi
In the following patch the migrate_misplaced_page() will be used to migrate THP for NUMA fault too. Prepare to deal with THP. Signed-off-by: Yang Shi --- include/linux/migrate.h | 6 -- mm/memory.c | 2 +- mm/migrate.c| 2 +- 3 files changed, 6 insertions(+), 4

Re: [PATCH v3 1/5] mm/migrate.c: make putback_movable_page() static

2021-03-25 Thread Yang Shi
and remove all the 3 VM_BUG_ON_PAGE(). Reviewed-by: Yang Shi > > Signed-off-by: Miaohe Lin > --- > include/linux/migrate.h | 1 - > mm/migrate.c| 7 +-- > 2 files changed, 1 insertion(+), 7 deletions(-) > > diff --git a/include/linux/migrate.h b/inclu

Re: [PATCH v2 5/5] mm/migrate.c: fix potential deadlock in NUMA balancing shared exec THP case

2021-03-23 Thread Yang Shi
On Tue, Mar 23, 2021 at 10:17 AM Yang Shi wrote: > > On Tue, Mar 23, 2021 at 6:55 AM Miaohe Lin wrote: > > > > Since commit c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA > > balancing"), the NUMA balancing would skip shared exec transh

Re: [PATCH v2 1/5] mm/migrate.c: remove unnecessary VM_BUG_ON_PAGE on putback_movable_page()

2021-03-23 Thread Yang Shi
On Tue, Mar 23, 2021 at 6:54 AM Miaohe Lin wrote: > > The !PageLocked() check is implicitly done in PageMovable(). Remove this > explicit one. TBH, I'm a little bit reluctant to have this kind change. If "locked" check is necessary we'd better make it explicit otherwise just remove it. And why

Re: [PATCH v2 2/5] mm/migrate.c: remove unnecessary rc != MIGRATEPAGE_SUCCESS check in 'else' case

2021-03-23 Thread Yang Shi
On Tue, Mar 23, 2021 at 6:54 AM Miaohe Lin wrote: > > It's guaranteed that in the 'else' case of the rc == MIGRATEPAGE_SUCCESS > check, rc does not equal to MIGRATEPAGE_SUCCESS. Remove this unnecessary > check. Reviewed-by: Yang Shi > > Reviewed-by: David Hildenbrand >

Re: [PATCH v2 5/5] mm/migrate.c: fix potential deadlock in NUMA balancing shared exec THP case

2021-03-23 Thread Yang Shi
the first place. Your fix is correct, and please add the above justification to your commit log. Reviewed-by: Yang Shi > > Fixes: c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA balancing") > Signed-off-by: Miaohe Lin > --- > mm/migrate.c | 4 > 1 file cha

Re: [PATCH v5 1/2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-22 Thread Yang Shi
On Sun, Mar 21, 2021 at 7:11 PM Zi Yan wrote: > > On 19 Mar 2021, at 19:37, Yang Shi wrote: > > > On Thu, Mar 18, 2021 at 5:52 PM Zi Yan wrote: > >> > >> From: Zi Yan > >> > >> We did not have a direct user interface of splitting the compound

Re: [PATCH v5 2/2] mm: huge_memory: debugfs for file-backed THP split.

2021-03-19 Thread Yang Shi
put_page(fpage); > + } > + > + filp_close(candidate, NULL); > + ret = 0; > + > + pr_info("%lu of %lu file-backed THP split\n", split, total); > +out: > + putname(file); > + return ret; > +} > + > +#define MAX_INPUT

Re: [PATCH v5 1/2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-19 Thread Yang Shi
elftests/vm to utilize the interface by splitting > PMD THPs and PTE-mapped THPs. > > This does not change the old behavior, i.e., writing 1 to the interface > to split all THPs in the system. > > Changelog: > > From v5: > 1. Skipped special VMAs and other fixes. (sugge

Re: [PATCH v4 2/2] mm: huge_memory: debugfs for file-backed THP split.

2021-03-17 Thread Yang Shi
On Wed, Mar 17, 2021 at 8:00 AM Zi Yan wrote: > > On 16 Mar 2021, at 19:18, Yang Shi wrote: > > > On Mon, Mar 15, 2021 at 1:34 PM Zi Yan wrote: > >> > >> From: Zi Yan > >> > >> Further extend /split_huge_pages to accept > >> ",

Re: [PATCH v4 2/2] mm: huge_memory: debugfs for file-backed THP split.

2021-03-16 Thread Yang Shi
On Mon, Mar 15, 2021 at 1:34 PM Zi Yan wrote: > > From: Zi Yan > > Further extend /split_huge_pages to accept > ",," for file-backed THP split tests since > tmpfs may have file backed by THP that mapped nowhere. > > Update selftest program to test file-backed THP split too. > > Suggested-by:

Re: [PATCH v4 1/2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-16 Thread Yang Shi
given pid code to a separate >function. > 2. Added the missing put_page for not split pages. > 3. pr_debug -> pr_info, make reading results simpler. > > From v2: > > 1. Reused existing /split_huge_pages interface. (suggested by >Yang Shi) > > From v1:

Re: [PATCH v3 0/4] mm/slub: Fix count_partial() problem

2021-03-15 Thread Yang Shi
On Mon, Mar 15, 2021 at 12:15 PM Roman Gushchin wrote: > > > On Mon, Mar 15, 2021 at 07:49:57PM +0100, Vlastimil Babka wrote: > > On 3/9/21 4:25 PM, Xunlei Pang wrote: > > > count_partial() can hold n->list_lock spinlock for quite long, which > > > makes much trouble to the system. This series

Re: [PATCH v3] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-15 Thread Yang Shi
On Mon, Mar 15, 2021 at 11:37 AM Zi Yan wrote: > > On 15 Mar 2021, at 8:07, Kirill A. Shutemov wrote: > > > On Thu, Mar 11, 2021 at 07:57:12PM -0500, Zi Yan wrote: > >> From: Zi Yan > >> > >> We do not have a direct user interface of splitting the compound page > >> backing a THP > > > > But we

Re: [PATCH v1 00/14] Multigenerational LRU

2021-03-15 Thread Yang Shi
On Fri, Mar 12, 2021 at 11:57 PM Yu Zhao wrote: > > TLDR > > The current page reclaim is too expensive in terms of CPU usage and > often making poor choices about what to evict. We would like to offer > a performant, versatile and straightforward augment. > > Repo > > git fetch

Re: [PATCH v2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-11 Thread Yang Shi
On Thu, Mar 11, 2021 at 7:52 AM Zi Yan wrote: > > On 10 Mar 2021, at 20:12, Yang Shi wrote: > > > On Wed, Mar 10, 2021 at 7:36 AM Zi Yan wrote: > >> > >> From: Zi Yan > >> > >> We do not have a direct user interface of splitting the compound

[v10 PATCH 13/13] mm: vmscan: shrink deferred objects proportional to priority

2021-03-11 Thread Yang Shi
patch: https://lore.kernel.org/linux-xfs/20191031234618.15403-13-da...@fromorbit.com/ Tested with kernel build and vfs metadata heavy workload in our production environment, no regression is spotted so far. Signed-off-by: Yang Shi --- mm/vmscan.c | 46

[v10 PATCH 12/13] mm: memcontrol: reparent nr_deferred when memcg offline

2021-03-11 Thread Yang Shi
Now shrinker's nr_deferred is per memcg for memcg aware shrinkers, add to parent's corresponding nr_deferred when memcg offline. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 1

[v10 PATCH 11/13] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-03-11 Thread Yang Shi
r's SHRINKER_MEMCG_AWARE flag would be cleared. This makes the implementation of this patch simpler. Acked-by: Vlastimil Babka Reviewed-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 31 --- 1 file changed, 16 inserti

[v10 PATCH 09/13] mm: vmscan: add per memcg shrinker nr_deferred

2021-03-11 Thread Yang Shi
When memcg is not enabled (!CONFIG_MEMCG or memcg disabled), the shrinker's nr_deferred would be used. And non memcg aware shrinkers use shrinker's nr_deferred all the time. Acked-by: Roman Gushchin Acked-by: Kirill Tkhai Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memco

[v10 PATCH 10/13] mm: vmscan: use per memcg nr_deferred of shrinker

2021-03-11 Thread Yang Shi
Signed-off-by: Yang Shi --- mm/vmscan.c | 78 - 1 file changed, 66 insertions(+), 12 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 397f3b67bad8..5bc6975cb635 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -376,6 +376,24 @@ static void

[v10 PATCH 07/13] mm: vmscan: add shrinker_info_protected() helper

2021-03-11 Thread Yang Shi
xtract the dereference into a helper to make the code more readable. No functional change. Acked-by: Roman Gushchin Acked-by: Kirill Tkhai Acked-by: Vlastimil Babka Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-)

[v10 PATCH 08/13] mm: vmscan: use a new flag to indicate shrinker is registered

2021-03-11 Thread Yang Shi
would prevent the shrinkers from unregistering correctly. Remove SHRINKER_REGISTERING since we could check if shrinker is registered successfully by the new flag. Acked-by: Kirill Tkhai Acked-by: Vlastimil Babka Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include

[v10 PATCH 06/13] mm: memcontrol: rename shrinker_map to shrinker_info

2021-03-11 Thread Yang Shi
he "memcg_" prefix. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 8 +++--- mm/memcontrol.c| 6 ++-- mm/vmscan.c| 58 +++

[v10 PATCH 05/13] mm: vmscan: use kvfree_rcu instead of call_rcu

2021-03-11 Thread Yang Shi
Using kvfree_rcu() to free the old shrinker_maps instead of call_rcu(). We don't have to define a dedicated callback for call_rcu() anymore. Acked-by: Roman Gushchin Acked-by: Kirill Tkhai Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 7 +-- 1 file changed, 1

[v10 PATCH 04/13] mm: vmscan: remove memcg_shrinker_map_size

2021-03-11 Thread Yang Shi
-by: Roman Gushchin Acked-by: Vlastimil Babka Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index b08c8d9055ae..641a0b8b4ea9 100644 --- a/mm/vmscan.c +++ b/mm

[v10 PATCH 03/13] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-03-11 Thread Yang Shi
larity. And a test with heavy paging workload didn't show write lock makes things worse. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 18 -- 1 file changed, 8 insertions(+), 10 del

[v10 PATCH 02/13] mm: vmscan: consolidate shrinker_maps handling code

2021-03-11 Thread Yang Shi
for tighter integration with shrinker code, and remove the "memcg_" prefix. There is no functional change. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 11 ++-- mm/hug

[v10 PATCH 00/13] Make shrinker's nr_deferred memcg aware

2021-03-11 Thread Yang Shi
mcg needs ~320 bytes. 10K memcgs would need ~3.2MB memory. It seems fine. We have been running the patched kernel on some hosts of our fleet (test and production) for months, it works very well. The monitor data shows the working set is sustained as expected. Yang Shi (13): mm: vmscan: use n
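The overhead estimate quoted above is a straightforward multiplication, which can be checked with a few lines of C. The constant and function names are hypothetical; the 320-byte figure is the approximate per-memcg cost stated in the cover letter, and megabytes are taken as decimal (10^6 bytes).

```c
#include <assert.h>
#include <limits.h>

/* Approximate per-memcg cost quoted in the cover letter ("~320 bytes"). */
#define MODEL_BYTES_PER_MEMCG 320UL

/* Estimated extra memory, in bytes, for nr_memcgs memory cgroups. */
static unsigned long model_nr_deferred_overhead(unsigned long nr_memcgs)
{
	/* Guard against overflow of the multiplication. */
	assert(nr_memcgs <= ULONG_MAX / MODEL_BYTES_PER_MEMCG);

	return nr_memcgs * MODEL_BYTES_PER_MEMCG;
}
```

For 10,000 memcgs this gives 3,200,000 bytes, which matches the ~3.2MB figure in the message.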

[v10 PATCH 01/13] mm: vmscan: use nid from shrink_control for tracepoint

2021-03-11 Thread Yang Shi
. It seems confusing. And the following patch will remove using nid directly in do_shrink_slab(), this patch also helps cleanup the code. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Reviewed-by: Shakeel Butt Acked-by: Roman Gushchin Signed-off-by: Yang Shi --- mm/vmscan.c | 2 +- 1 file

Re: [PATCH v2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-10 Thread Yang Shi
On Wed, Mar 10, 2021 at 7:36 AM Zi Yan wrote: > > From: Zi Yan > > We do not have a direct user interface of splitting the compound page > backing a THP and there is no need unless we want to expose the THP > implementation details to users. Adding an interface for debugging. > > By writing ",,"

Re: [v9 PATCH 13/13] mm: vmscan: shrink deferred objects proportional to priority

2021-03-10 Thread Yang Shi
On Wed, Mar 10, 2021 at 2:41 PM Shakeel Butt wrote: > > On Wed, Mar 10, 2021 at 1:41 PM Yang Shi wrote: > > > > On Wed, Mar 10, 2021 at 1:08 PM Shakeel Butt wrote: > > > > > > On Wed, Mar 10, 2021 at 10:54 AM Yang Shi wrote: > > > > >

Re: [v9 PATCH 13/13] mm: vmscan: shrink deferred objects proportional to priority

2021-03-10 Thread Yang Shi
On Wed, Mar 10, 2021 at 1:08 PM Shakeel Butt wrote: > > On Wed, Mar 10, 2021 at 10:54 AM Yang Shi wrote: > > > > On Wed, Mar 10, 2021 at 10:24 AM Shakeel Butt wrote: > > > > > > On Wed, Mar 10, 2021 at 9:46 AM Yang Shi wrote: > > > > >

Re: [v9 PATCH 13/13] mm: vmscan: shrink deferred objects proportional to priority

2021-03-10 Thread Yang Shi
On Wed, Mar 10, 2021 at 10:24 AM Shakeel Butt wrote: > > On Wed, Mar 10, 2021 at 9:46 AM Yang Shi wrote: > > > > The number of deferred objects might get windup to an absurd number, and it > > results in clamp of slab objects. It is undesirable for sustaining > >

[v9 PATCH 13/13] mm: vmscan: shrink deferred objects proportional to priority

2021-03-10 Thread Yang Shi
patch: https://lore.kernel.org/linux-xfs/20191031234618.15403-13-da...@fromorbit.com/ Tested with kernel build and vfs metadata heavy workload in our production environment, no regression is spotted so far. Signed-off-by: Yang Shi --- mm/vmscan.c | 46

[v9 PATCH 11/13] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-03-10 Thread Yang Shi
r's SHRINKER_MEMCG_AWARE flag would be cleared. This makes the implementation of this patch simpler. Acked-by: Vlastimil Babka Reviewed-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 31 --- 1 file changed, 16 inserti

[v9 PATCH 12/13] mm: memcontrol: reparent nr_deferred when memcg offline

2021-03-10 Thread Yang Shi
Now shrinker's nr_deferred is per memcg for memcg aware shrinkers, add to parent's corresponding nr_deferred when memcg offline. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 1

[v9 PATCH 10/13] mm: vmscan: use per memcg nr_deferred of shrinker

2021-03-10 Thread Yang Shi
Signed-off-by: Yang Shi --- mm/vmscan.c | 78 - 1 file changed, 66 insertions(+), 12 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index ae82afe6cec6..326f0e0c4356 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -374,6 +374,24 @@ static void

[v9 PATCH 08/13] mm: vmscan: use a new flag to indicate shrinker is registered

2021-03-10 Thread Yang Shi
would prevent the shrinkers from unregistering correctly. Remove SHRINKER_REGISTERING since we could check if shrinker is registered successfully by the new flag. Acked-by: Kirill Tkhai Acked-by: Vlastimil Babka Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include

[v9 PATCH 09/13] mm: vmscan: add per memcg shrinker nr_deferred

2021-03-10 Thread Yang Shi
When memcg is not enabled (!CONFIG_MEMCG or memcg disabled), the shrinker's nr_deferred would be used. And non memcg aware shrinkers use shrinker's nr_deferred all the time. Acked-by: Roman Gushchin Acked-by: Kirill Tkhai Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memco

[v9 PATCH 06/13] mm: memcontrol: rename shrinker_map to shrinker_info

2021-03-10 Thread Yang Shi
he "memcg_" prefix. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 8 +++--- mm/memcontrol.c| 6 ++-- mm/vmscan.c| 58 +++

[v9 PATCH 07/13] mm: vmscan: add shrinker_info_protected() helper

2021-03-10 Thread Yang Shi
xtract the dereference into a helper to make the code more readable. No functional change. Acked-by: Roman Gushchin Acked-by: Kirill Tkhai Acked-by: Vlastimil Babka Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 15 ++- 1 file changed, 10 insertions(+), 5 del

[v9 PATCH 04/13] mm: vmscan: remove memcg_shrinker_map_size

2021-03-10 Thread Yang Shi
-by: Roman Gushchin Acked-by: Vlastimil Babka Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 75fd8038a6c8..bda67e1ac84b 100644 --- a/mm/vmscan.c +++ b/mm

[v9 PATCH 05/13] mm: vmscan: use kvfree_rcu instead of call_rcu

2021-03-10 Thread Yang Shi
Use kvfree_rcu() to free the old shrinker_maps instead of call_rcu(). We don't have to define a dedicated callback for call_rcu() anymore. Acked-by: Roman Gushchin Acked-by: Kirill Tkhai Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 7 +-- 1 file changed, 1

[v9 PATCH 03/13] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-03-10 Thread Yang Shi
larity. And a test with heavy paging workload didn't show that the write lock makes things worse. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- mm/vmscan.c | 18 -- 1 file changed, 8 insertions(+), 10 del

[v9 PATCH 02/13] mm: vmscan: consolidate shrinker_maps handling code

2021-03-10 Thread Yang Shi
for tighter integration with shrinker code, and remove the "memcg_" prefix. There is no functional change. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 11 ++-- mm/hug

[v9 PATCH 00/13] Make shrinker's nr_deferred memcg aware

2021-03-10 Thread Yang Shi
nd production) for months, it works very well. The monitor data shows the working set is sustained as expected. Yang Shi (13): mm: vmscan: use nid from shrink_control for tracepoint mm: vmscan: consolidate shrinker_maps handling code mm: vmscan: use shrinker_rwsem to protect

[v9 PATCH 01/13] mm: vmscan: use nid from shrink_control for tracepoint

2021-03-10 Thread Yang Shi
. It seems confusing. And the following patch will remove using nid directly in do_shrink_slab(), so this patch also helps clean up the code. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Reviewed-by: Shakeel Butt Acked-by: Roman Gushchin Signed-off-by: Yang Shi --- mm/vmscan.c | 2 +- 1 file

Re: [PATCH 00/10] [v6] Migrate Pages in lieu of discard

2021-03-08 Thread Yang Shi
> > The meat of this patch is in: > > [PATCH 05/10] mm/migrate: demote pages during reclaim > > Which also has the most changes since the last post. This version is > mostly to address review comments from Yang Shi and Oscar Salvador. > Review comments are documented in the individua

Re: [PATCH 10/10] mm/migrate: new zone_reclaim_mode to enable reclaim migration

2021-03-08 Thread Yang Shi
the guarantees. I think we'd better have the cpuset violation paragraph along with the new zone reclaim mode text so that the users are aware of the potential violation. I don't think the commit log is the go-to place for ordinary users. > > Signed-off-by: Dave Hansen > Cc: Yang Shi > C

Re: [PATCH 09/10] mm/vmscan: never demote for memcg reclaim

2021-03-08 Thread Yang Shi
al is to reduce the > total memory consumption of the entire memcg, across all > nodes. Migration does not assist memcg reclaim because > it just moves page contents between nodes rather than > actually reducing memory consumption. Reviewed-by: Yang Shi > > Signed-off-by: Dave

Re: [PATCH 08/10] mm/vmscan: Consider anonymous pages without swap

2021-03-08 Thread Yang Shi
/preliminary > which just says whether there is a possibility of future reclaim. Reviewed-by: Yang Shi > > #Signed-off-by: Keith Busch > Cc: Keith Busch > Signed-off-by: Dave Hansen > Cc: Yang Shi > Cc: David Rientjes > Cc: Huang Ying > Cc: Dan Williams > Cc: David Hildenbran

Re: [PATCH 07/10] mm/vmscan: add helper for querying ability to age anonymous pages

2021-03-08 Thread Yang Shi
tal_swap_pages' checks into a helper, give it a > logically significant name, and check for the possibility of page > demotion. Reviewed-by: Yang Shi > > Signed-off-by: Dave Hansen > Cc: David Rientjes > Cc: Huang Ying > Cc: Dan Williams > Cc: David Hildenbrand > Cc: osalvador &

Re: [PATCH 06/10] mm/vmscan: add page demotion counter

2021-03-08 Thread Yang Shi
On Thu, Mar 4, 2021 at 4:01 PM Dave Hansen wrote: > > > From: Yang Shi > > Account the number of demoted pages into reclaim_state->nr_demoted. > > Add pgdemote_kswapd and pgdemote_direct VM counters shown in > /proc/vmstat. > > [ daveh: >- __count_v
