[PATCH v2 2/2] mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate.

2020-10-30 Thread Zi Yan
From: Zi Yan In isolate_migratepages_block, if we have too many isolated pages and nr_migratepages is not zero, we should try to migrate what we have without wasting time on isolating. Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”) Suggested

Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-30 Thread Zi Yan
On 30 Oct 2020, at 10:50, Vlastimil Babka wrote: > On 10/29/20 9:04 PM, Zi Yan wrote: >> From: Zi Yan >> >> In isolate_migratepages_block, when cc->alloc_contig is true, we are >> able to isolate compound pages, nr_migratepages and nr_isolated did not >> cou

Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-30 Thread Zi Yan
On 30 Oct 2020, at 10:49, Michal Hocko wrote: > On Fri 30-10-20 10:35:43, Zi Yan wrote: >> On 30 Oct 2020, at 9:36, Michal Hocko wrote: >> >>> On Fri 30-10-20 08:20:50, Zi Yan wrote: >>>> On 30 Oct 2020, at 5:43, Michal Hocko wrote: >>>> >>

Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-30 Thread Zi Yan
On 30 Oct 2020, at 9:36, Michal Hocko wrote: > On Fri 30-10-20 08:20:50, Zi Yan wrote: >> On 30 Oct 2020, at 5:43, Michal Hocko wrote: >> >>> [Cc Vlastimil] >>> >>> On Thu 29-10-20 16:04:35, Zi Yan wrote: >>>> From: Zi Yan >>>>

Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-30 Thread Zi Yan
On 30 Oct 2020, at 5:43, Michal Hocko wrote: > [Cc Vlastimil] > > On Thu 29-10-20 16:04:35, Zi Yan wrote: >> From: Zi Yan >> >> In isolate_migratepages_block, when cc->alloc_contig is true, we are >> able to isolate compound pages, nr_migratepages and nr_isol

Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-29 Thread Zi Yan
On 29 Oct 2020, at 20:28, Andrew Morton wrote: > On Thu, 29 Oct 2020 17:31:28 -0400 Zi Yan wrote: > >>> >>> Shall you add Fixes tag to commit >>> 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable. >> >> Sure. >> >> Fixes:

Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-29 Thread Zi Yan
On 29 Oct 2020, at 17:14, Yang Shi wrote: > On Thu, Oct 29, 2020 at 1:04 PM Zi Yan wrote: >> >> From: Zi Yan >> >> In isolate_migratepages_block, when cc->alloc_contig is true, we are >> able to isolate compound pages, nr_migratepages and nr_isolated did n

[PATCH] mm/compaction: count pages and stop correctly during page isolation.

2020-10-29 Thread Zi Yan
From: Zi Yan In isolate_migratepages_block, when cc->alloc_contig is true, we are able to isolate compound pages, nr_migratepages and nr_isolated did not count compound pages correctly, causing us to isolate more pages than we thought. Use thp_nr_pages to count pages. Otherwise, we mi
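The counting issue described in this patch can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct demo_page`, `demo_nr_pages()` and `demo_count_isolated()` are hypothetical names, and `demo_nr_pages()` only mimics what `thp_nr_pages()` is described as providing, under the assumption that a compound page of order n covers 2^n base pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: order 0 is a base page,
 * order 9 would be a 2MB THP on x86_64. */
struct demo_page {
    unsigned int order;
};

/* Number of base pages covered by this (possibly compound) page.
 * This mimics the role thp_nr_pages() plays in the patch description. */
static unsigned long demo_nr_pages(const struct demo_page *page)
{
    assert(page != NULL);
    return 1UL << page->order;
}

/* Sum base pages across an isolated batch instead of counting
 * one per list entry, which would undercount compound pages. */
static unsigned long demo_count_isolated(const struct demo_page *pages, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += demo_nr_pages(&pages[i]);
    return total;
}
```

With a batch of two base pages and one order-9 page, counting entries gives 3 while counting base pages gives 514, which is the kind of gap the patch description refers to.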

Re: [PATCH v1 0/2] mm: cma: introduce a non-blocking version of cma_release()

2020-10-22 Thread Zi Yan
On 22 Oct 2020, at 20:47, Roman Gushchin wrote: > On Thu, Oct 22, 2020 at 07:42:45PM -0400, Zi Yan wrote: >> On 22 Oct 2020, at 18:53, Roman Gushchin wrote: >> >>> This small patchset introduces a non-blocking version of cma_release() >>> and simplifies the code

Re: [PATCH v1 0/2] mm: cma: introduce a non-blocking version of cma_release()

2020-10-22 Thread Zi Yan
On 22 Oct 2020, at 18:53, Roman Gushchin wrote: > This small patchset introduces a non-blocking version of cma_release() > and simplifies the code in hugetlbfs, where previously we had to > temporarily drop hugetlb_lock around the cma_release() call. > > It should help Zi Yan on h

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-05 Thread Zi Yan
On 5 Oct 2020, at 11:55, Matthew Wilcox wrote: > On Mon, Oct 05, 2020 at 11:03:56AM -0400, Zi Yan wrote: >> On 2 Oct 2020, at 4:30, David Hildenbrand wrote: >>> Yes, I think one important feature would be that we don't end up placing >>> a gigantic page where only a ha

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-05 Thread Zi Yan
On 5 Oct 2020, at 13:39, David Hildenbrand wrote: considering that 2MB THP have turned out to be quite a pain but situation has settled over time. Maybe our current code base is prepared for that much better. >> >> I am planning to refactor my code further to reduce the amount of

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-05 Thread Zi Yan
On 2 Oct 2020, at 3:50, David Hildenbrand wrote: - huge page sizes controllable by the userspace? >>> >>> It might be good to allow advanced users to choose the page sizes, so they >>> have better control of their applications. >> >> Could you elaborate more? Those advanced users can use

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-05 Thread Zi Yan
On 2 Oct 2020, at 4:30, David Hildenbrand wrote: > On 02.10.20 10:10, Michal Hocko wrote: >> On Fri 02-10-20 09:50:02, David Hildenbrand wrote: >> - huge page sizes controllable by the userspace? > > It might be good to allow advanced users to choose the page sizes, so they > have

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-01 Thread Zi Yan
On 30 Sep 2020, at 7:55, Michal Hocko wrote: > On Mon 28-09-20 13:53:58, Zi Yan wrote: >> From: Zi Yan >> >> Hi all, >> >> This patchset adds support for 1GB PUD THP on x86_64. It is on top of >> v5.9-rc5-mmots-2020-09-18-21-23. It is also available at:

Re: [RFC PATCH v2 03/30] mm: thp: use single linked list for THP page table page deposit.

2020-09-28 Thread Zi Yan
On 28 Sep 2020, at 15:34, Matthew Wilcox wrote: > On Mon, Sep 28, 2020 at 01:54:01PM -0400, Zi Yan wrote: >> struct {/* Page table pages */ >> -unsigned long _pt_pad_1;/* compound_head */ >> -pgtable_t pmd_h

[RFC PATCH v2 24/30] mm: madvise: add page size options to MADV_HUGEPAGE and MADV_NOHUGEPAGE.

2020-09-28 Thread Zi Yan
From: Zi Yan It allows user to specify up to what page size kernel will generate THPs to back up the memory range in madvise. Because we now have PMD and PUD THPs, they require different amount of kernel effort to be generated, and we want to prevent user from getting long page fault latency

[RFC PATCH v2 19/30] mm: stats: make smap stats understand PUD THPs.

2020-09-28 Thread Zi Yan
From: Zi Yan Signed-off-by: Zi Yan --- fs/proc/task_mmu.c | 68 ++ 1 file changed, 63 insertions(+), 5 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index a21484b1414d..077196182288 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc

[RFC PATCH v2 26/30] mm: thp: add a global knob to enable/disable PUD THPs.

2020-09-28 Thread Zi Yan
From: Zi Yan Like the existing global PMD THP knob, it allows user to enable/disable PUD THPs. PUD THP is disabled by default unless user knows the performance tradeoff of using it, like longer first time page fault due to larger page zeroing and longer page allocation time when memory

[RFC PATCH v2 10/30] fs: proc: add PUD THP kpageflag.

2020-09-28 Thread Zi Yan
From: Zi Yan Bit 27 is used to identify PUD THP. Signed-off-by: Zi Yan --- fs/proc/page.c | 2 ++ include/uapi/linux/kernel-page-flags.h | 1 + 2 files changed, 3 insertions(+) diff --git a/fs/proc/page.c b/fs/proc/page.c index f3b39a7d2bf3..e4e2ad3612c9 100644

[RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-09-28 Thread Zi Yan
From: Zi Yan Hi all, This patchset adds support for 1GB PUD THP on x86_64. It is on top of v5.9-rc5-mmots-2020-09-18-21-23. It is also available at: https://github.com/x-y-z/linux-1gb-thp/tree/1gb_thp_v5.9-rc5-mmots-2020-09-18-21-23 Other than PUD THP, we had some discussion on generating THPs

[RFC PATCH v2 13/30] mm: rmap: add map_order to page_remove_anon_compound_rmap.

2020-09-28 Thread Zi Yan
From: Zi Yan When PMD-mapped PUD THP is enabled by the upcoming commits, we can unmap a PMD-mapped PUD THP that should be counted as NR_ANON_THPS. The added map_order tells us about this situation. Signed-off-by: Zi Yan --- mm/rmap.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions

[RFC PATCH v2 08/30] mm: thp: add PUD THP support for copy_huge_pud.

2020-09-28 Thread Zi Yan
From: Zi Yan copy_huge_pud needs to allocate 1 PMD page table page and 512 PTE page table pages and deposit them when copying a PUD THP. It is similar to what we do at PUD THP page faults. Signed-off-by: Zi Yan --- mm/huge_memory.c | 36 1 file changed, 28

[RFC PATCH v2 23/30] mm: support PUD THP pagemap support.

2020-09-28 Thread Zi Yan
From: Zi Yan pagemap_pud_range is added to print pud page flags properly. Signed-off-by: Zi Yan --- fs/proc/task_mmu.c | 63 ++ 1 file changed, 63 insertions(+) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 077196182288..04a3158d0d5b

[RFC PATCH v2 28/30] hugetlb: cma: move cma reserve function to cma.c.

2020-09-28 Thread Zi Yan
From: Zi Yan It will be used by other allocations, like 1GB THP allocation in the upcoming commit. Signed-off-by: Zi Yan --- .../admin-guide/kernel-parameters.txt | 2 +- arch/arm64/mm/hugetlbpage.c | 2 +- arch/powerpc/mm/hugetlbpage.c | 2

[RFC PATCH v2 20/30] mm: page_vma_walk: teach it about PMD-mapped PUD THP.

2020-09-28 Thread Zi Yan
From: Zi Yan We now have PMD-mapped PUD THP and PTE-mapped PUD THP, page_vma_walk should handle them properly. Signed-off-by: Zi Yan --- mm/page_vma_mapped.c | 152 +-- 1 file changed, 118 insertions(+), 34 deletions(-) diff --git a/mm

[RFC PATCH v2 22/30] mm: thp: split PUD THPs at page reclaim.

2020-09-28 Thread Zi Yan
From: Zi Yan We cannot swap PUD THPs, so split them before swapping them out. PUD THPs will be split into PMD THPs, so that if THP_SWAP is enabled, PMD THPs can be swapped out as a whole. Signed-off-by: Zi Yan --- mm/swap_slots.c | 2 ++ mm/vmscan.c | 33 +++-- 2

[RFC PATCH v2 30/30] mm: thp: enable anonymous PUD THP at page fault path.

2020-09-28 Thread Zi Yan
From: Zi Yan All previous commits have anonymous PUD THP support ready, so we can enable anonymous PUD THP page fault now. Signed-off-by: Zi Yan --- mm/memory.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 9f7b509a3aa7..dc285d9872fc

[RFC PATCH v2 15/30] mm: thp: add PUD THP to deferred split list when PUD mapping is gone.

2020-09-28 Thread Zi Yan
From: Zi Yan When PUD mapping is gone, there is no need to keep the PUD THP. Add it to deferred split list, so when memory pressure comes, the THP will be split. Signed-off-by: Zi Yan --- mm/rmap.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/rmap.c b/mm/rmap.c index b4950f7a0978

[RFC PATCH v2 18/30] mm: thp: PUD THP follow_p*d_page() support.

2020-09-28 Thread Zi Yan
From: Zi Yan Add follow_page support for PUD THPs. Signed-off-by: Zi Yan --- include/linux/huge_mm.h | 11 +++ mm/gup.c| 60 - mm/huge_memory.c| 73 - 3 files changed, 142 insertions(+), 2

[RFC PATCH v2 11/30] mm: thp: handling PUD THP reference bit.

2020-09-28 Thread Zi Yan
From: Zi Yan Add PUD-level TLB flush ops and teach page_vma_mapped_talk about PUD THPs. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgtable.h | 3 +++ arch/x86/mm/pgtable.c | 13 + include/linux/mmu_notifier.h | 13 + include/linux/pgtable.h| 14

[RFC PATCH v2 01/30] mm/pagewalk: use READ_ONCE when reading the PUD entry unlocked

2020-09-28 Thread Zi Yan
From: Jason Gunthorpe The pagewalker runs while only holding the mmap_sem for read. The pud can be set asynchronously, while also holding the mmap_sem for read eg from: handle_mm_fault() __handle_mm_fault() create_huge_pmd() dev_dax_huge_fault() __dev_dax_pud_fault()

[RFC PATCH v2 25/30] mm: vma: add VM_HUGEPAGE_PUD to vm_flags at bit 37.

2020-09-28 Thread Zi Yan
From: Zi Yan madvise can set this bit via MADV_HUGEPAGE | MADV_HUGEPAGE_1GB and unset it via MADV_NOHUGEPAGE | MADV_HUGEPAGE_1GB. Later, kernel will check this bit to decide whether to allocate PUD THPs or not on a VMA when the global PUD THP is set to madvise. Signed-off-by: Zi Yan

[RFC PATCH v2 05/30] mm: thp: add page table deposit/withdraw functions for PUD THP.

2020-09-28 Thread Zi Yan
From: Zi Yan We deposit 512 PMD pages, each of which has 512 PTE pages deposited in its ->deposit_head, to mm->deposit_head_pud. They will be withdrawn and used when a PUD THP split into 512 PMD THPs. In this way, when any of the 512 PMD THPs is split further, we will use the existing cod

[RFC PATCH v2 14/30] mm: thp: add PUD THP split_huge_pud_page() function.

2020-09-28 Thread Zi Yan
From: Zi Yan It mimics PMD-level THP split. In addition, to support PMD-mapped PUD THP, PMDPageInPUD() is added to identify the first page in the PMD sized aligned physical pages. For example, in x86_64, the page[0], page[512], page[1024], ... are regarded as PMDPageInPUD. For the mapcount
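The page[0], page[512], page[1024] example above can be sketched as a simple index test. This is only an illustration, not the patch's `PMDPageInPUD()` implementation: the function name, the constants, and the idea of passing a subpage index relative to the PUD THP head are assumptions, using x86_64 sizes (512 base pages per PMD-sized chunk, 512 chunks per PUD THP).

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PMD_NR 512UL            /* base pages per PMD-sized chunk (x86_64) */
#define DEMO_PUD_NR (512UL * 512UL)  /* base pages per 1GB PUD THP (x86_64) */

/* True when the subpage at index idx (relative to the PUD THP head page)
 * is the first page of a PMD-sized, PMD-aligned chunk. */
static bool demo_is_pmd_page_in_pud(unsigned long idx)
{
    assert(idx < DEMO_PUD_NR);
    return (idx % DEMO_PMD_NR) == 0;
}
```

Under these assumptions, indices 0, 512 and 1024 pass the test while 1 and 513 do not.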

[RFC PATCH v2 17/30] mm: thp: PUD THP COW splits PUD page and falls back to PMD page.

2020-09-28 Thread Zi Yan
From: Zi Yan COW on PUD THPs has the same behavior as COW on PMD THPs to avoid high COW overhead. As a result, do_huge_pmd_wp will see PMD-mapped PUD THPs, thus needs to count PUD mappings in total mapcount when calling page_trans_huge_map_swapcount in reuse_swap_page to avoid false positive

[RFC PATCH v2 09/30] mm: thp: add PUD THP support to zap_huge_pud.

2020-09-28 Thread Zi Yan
From: Zi Yan Preallocated 513 (1 PMD and 512 PTE) page table pages need to be freed when PUD THP is removed. zap_pud_deposited_table is added to perform the action. Signed-off-by: Zi Yan --- mm/huge_memory.c | 48 +--- 1 file changed, 45 insertions

[RFC PATCH v2 03/30] mm: thp: use single linked list for THP page table page deposit.

2020-09-28 Thread Zi Yan
From: Zi Yan The old design uses the double linked list page->lru to chain all deposited page table pages when creating a THP and page->pmd_huge_pte to point to the first page of the list. As the second pointer in page->lru overlaps with page->pmd_huge_pte, the design prevents mult

[RFC PATCH v2 27/30] mm: thp: make PUD THP size public.

2020-09-28 Thread Zi Yan
From: Zi Yan User can access the PUD THP size via `cat /sys/kernel/mm/transparent_hugepage/hpage_pud_size`. This is similar to make PMD THP size public. Signed-off-by: Zi Yan --- Documentation/admin-guide/mm/transhuge.rst | 1 + mm/huge_memory.c | 13

[RFC PATCH v2 04/30] mm: add new helper functions to allocate one PMD page with 512 PTE pages.

2020-09-28 Thread Zi Yan
From: Zi Yan This prepares for PUD THP support, which allocates 512 of such PMD pages when creating a PUD THP. These page table pages will be withdrawn during THP split. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgalloc.h | 60 ++ arch/x86/mm/pgtable.c

[RFC PATCH v2 16/30] mm: debug: adapt dump_page to PUD THP.

2020-09-28 Thread Zi Yan
From: Zi Yan Since the order of a PUD THP is greater than MAX_ORDER, do not consider its tail pages corrupted. Also print sub_compound_mapcount when dumping a PMDPageInPUD. Signed-off-by: Zi Yan --- mm/debug.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/debug.c

[RFC PATCH v2 02/30] mm: pagewalk: use READ_ONCE when reading the PMD entry unlocked

2020-09-28 Thread Zi Yan
From: Zi Yan The pagewalker runs while only holding the mmap_sem for read. The pud can be set asynchronously, while also holding the mmap_sem for read. This follows the same way as the commit: "mm/pagewalk: use READ_ONCE when reading the PUD entry unlocked" Signed-off-by: Zi Yan --- fs

[RFC PATCH v2 06/30] mm: change thp_order and thp_nr as we will have not just PMD THPs.

2020-09-28 Thread Zi Yan
From: Zi Yan As PUD THP is going to be added in the following patches, thp_order and thp_nr can be HPAGE_PUD_ORDER and HPAGE_PUD_NR, respectively. Signed-off-by: Zi Yan --- include/linux/huge_mm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/huge_mm.h

[RFC PATCH v2 29/30] mm: thp: use cma reservation for pud thp allocation.

2020-09-28 Thread Zi Yan
From: Zi Yan Sharing hugepage_cma reservation with hugetlb for pud thp allocation. The reserved cma regions still can be used for moveable page allocations. During 1GB page split, all subpages are cleared from the CMA bitmap, since they are no longer 1GB pages and will be freed via the normal

[RFC PATCH v2 12/30] mm: rmap: add mappped/unmapped page order to anonymous page rmap functions.

2020-09-28 Thread Zi Yan
From: Zi Yan page_add_anon_rmap, do_page_add_anon_rmap, page_add_new_anon_rmap, page_remove_rmap are changed to have page order as a parameter. This prepares for PMD-mapped PUD THP, since a PUD THP can be mapped in three different ways, PTEs, PMDs, and PUDs and the original boolean parameter

[RFC PATCH v2 21/30] mm: thp: PUD THP support in try_to_unmap().

2020-09-28 Thread Zi Yan
From: Zi Yan Unmap different subpages in different sized THPs properly in the try_to_unmap() function. pvmw.pte, pvmw.pmd, pvmw.pud are used to identify unmapped page sizes: 1. pvmw.pte != NULL: PTE pages or PageHuge. 2. pvmw.pte == NULL and pvmw.pmd != NULL: PMD pages. 3. pvmw.pte == NULL

[RFC PATCH v2 07/30] mm: thp: add anonymous PUD THP page fault support without enabling it.

2020-09-28 Thread Zi Yan
From: Zi Yan This adds PUD THP support for anonymous pages. Applications will be able to get PUD pages during page faults when their VMAs are larger than PUD page size after the page fault path is enabled. No shared zero PUD THP is created and shared by all read-only zero PUD THPs, different

Re: [PATCH] mm:cleanup mincore_huge_pmd

2020-09-23 Thread Zi Yan
bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, >unsigned long new_addr, >pmd_t *old_pmd, pmd_t *new_pmd); > -- > 2.17.1 LGTM. Reviewed-by: Zi Yan Thanks. -- Best Regards, Yan Zi

[PATCH v2] mm/migrate: correct thp migration stats.

2020-09-17 Thread Zi Yan
From: Zi Yan PageTransHuge returns true for both thp and hugetlb, so thp stats was counting both thp and hugetlb migrations. Exclude hugetlb migration by setting is_thp variable right. Clean up thp handling code too when we are there. Fixes: 1a5bae25e3cf ("mm/vmstat: add events fo

Re: [PATCH] mm/migrate: correct thp migration stats.

2020-09-17 Thread Zi Yan
On 17 Sep 2020, at 16:59, Daniel Jordan wrote: > On Thu, Sep 17, 2020 at 04:27:29PM -0400, Zi Yan wrote: >> From: Zi Yan >> >> PageTransHuge returns true for both thp and hugetlb, so thp stats was >> counting both thp and hugetlb migrations. Exclude hugetlb mig

[PATCH] mm/migrate: correct thp migration stats.

2020-09-17 Thread Zi Yan
From: Zi Yan PageTransHuge returns true for both thp and hugetlb, so thp stats was counting both thp and hugetlb migrations. Exclude hugetlb migration by setting is_thp variable right. Fixes: 1a5bae25e3cf ("mm/vmstat: add events for THP migration without split") Signed-off-by:

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-10 Thread Zi Yan
On 10 Sep 2020, at 4:27, David Hildenbrand wrote: > On 10.09.20 09:32, Michal Hocko wrote: >> [Cc Vlastimil and Mel - the whole email thread starts >> http://lkml.kernel.org/r/20200902180628.4052244-1-zi@sent.com >> but this particular subthread has diverged a bit and you might find it >>

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-10 Thread Zi Yan
On 10 Sep 2020, at 9:32, Rik van Riel wrote: > On Thu, 2020-09-10 at 09:32 +0200, Michal Hocko wrote: >> [Cc Vlastimil and Mel - the whole email thread starts >> http://lkml.kernel.org/r/20200902180628.4052244-1-zi@sent.com >> but this particular subthread has diverged a bit and you might

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-10 Thread Zi Yan
On 10 Sep 2020, at 10:34, David Hildenbrand wrote: >>> As long as we stay in safe zone boundaries you get a benefit in most >>> scenarios. As soon as we would have a (temporary) workload that would >>> require more unmovable allocations we would fallback to polluting some >>> pageblocks only. >>

Re: [RFC PATCH 06/16] mm: thp: add 1GB THP split_huge_pud_page() function.

2020-09-09 Thread Zi Yan
On 9 Sep 2020, at 10:18, Kirill A. Shutemov wrote: > On Wed, Sep 02, 2020 at 02:06:18PM -0400, Zi Yan wrote: >> 25 files changed, 852 insertions(+), 98 deletions(-) > > It's way too big to have meaningful review. Will split it into small patches in the next version. — Best

Re: [RFC PATCH 01/16] mm: add pagechain container for storing multiple pages.

2020-09-09 Thread Zi Yan
On 9 Sep 2020, at 9:46, Kirill A. Shutemov wrote: > On Mon, Sep 07, 2020 at 11:11:05AM -0400, Zi Yan wrote: >> On 7 Sep 2020, at 8:22, Kirill A. Shutemov wrote: >> >>> On Wed, Sep 02, 2020 at 02:06:13PM -0400, Zi Yan wrote: >>>> From: Zi Yan >>>

Re: [RFC PATCH 05/16] mm: thp: handling 1GB THP reference bit.

2020-09-09 Thread Zi Yan
On 9 Sep 2020, at 10:09, Kirill A. Shutemov wrote: > On Wed, Sep 02, 2020 at 02:06:17PM -0400, Zi Yan wrote: >> From: Zi Yan >> >> Add PUD-level TLB flush ops and teach page_vma_mapped_talk about 1GB >> THPs. >> >> Signed-off-by: Zi Yan >>

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-08 Thread Zi Yan
On 7 Sep 2020, at 3:20, Michal Hocko wrote: > On Fri 04-09-20 14:10:45, Roman Gushchin wrote: >> On Fri, Sep 04, 2020 at 09:42:07AM +0200, Michal Hocko wrote: > [...] >>> An explicit opt-in sounds much more appropriate to me as well. If we go >>> with a specific API then I would not make it 1GB

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-08 Thread Zi Yan
On 8 Sep 2020, at 10:22, David Hildenbrand wrote: > On 08.09.20 16:05, Zi Yan wrote: >> On 8 Sep 2020, at 7:57, David Hildenbrand wrote: >> >>> On 03.09.20 18:30, Roman Gushchin wrote: >>>> On Thu, Sep 03, 2020 at 05:23:00PM +0300, Kirill A. Shutemov wrote:

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-08 Thread Zi Yan
On 8 Sep 2020, at 10:27, Matthew Wilcox wrote: > On Tue, Sep 08, 2020 at 10:05:11AM -0400, Zi Yan wrote: >> On 8 Sep 2020, at 7:57, David Hildenbrand wrote: >>> I have concerns if we would silently use 1~GB THPs in most scenarios >>> where we would have used 2~MB TH

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-08 Thread Zi Yan
On 8 Sep 2020, at 7:57, David Hildenbrand wrote: > On 03.09.20 18:30, Roman Gushchin wrote: >> On Thu, Sep 03, 2020 at 05:23:00PM +0300, Kirill A. Shutemov wrote: >>> On Wed, Sep 02, 2020 at 02:06:12PM -0400, Zi Yan wrote: >>>> From: Zi Yan >>>> >>

Re: [RFC PATCH 01/16] mm: add pagechain container for storing multiple pages.

2020-09-07 Thread Zi Yan
On 7 Sep 2020, at 8:22, Kirill A. Shutemov wrote: > On Wed, Sep 02, 2020 at 02:06:13PM -0400, Zi Yan wrote: >> From: Zi Yan >> >> When depositing page table pages for 1GB THPs, we need 512 PTE pages + >> 1 PMD page. Instead of counting and depositing 513 pages,

Re: [PATCH v2 1/7] mm/thp: fix __split_huge_pmd_locked() for migration PMD

2020-09-02 Thread Zi Yan
MD entry is a migration PMD entry, the call to > is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead > of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). > Fix these problems by checking for a PMD migration entry. > > Signed-off-by: Ralph Campbell Thanks for th

Re: [RFC PATCH 01/16] mm: add pagechain container for storing multiple pages.

2020-09-02 Thread Zi Yan
On 2 Sep 2020, at 16:29, Randy Dunlap wrote: > On 9/2/20 11:06 AM, Zi Yan wrote: >> From: Zi Yan >> >> When depositing page table pages for 1GB THPs, we need 512 PTE pages + >> 1 PMD page. Instead of counting and depositing 513 pages, we can use the >> PM

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-02 Thread Zi Yan
On 2 Sep 2020, at 15:57, Jason Gunthorpe wrote: > On Wed, Sep 02, 2020 at 03:05:39PM -0400, Zi Yan wrote: >> On 2 Sep 2020, at 14:48, Jason Gunthorpe wrote: >> >>> On Wed, Sep 02, 2020 at 02:45:37PM -0400, Zi Yan wrote: >>> >>>>> Surprised this does

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-02 Thread Zi Yan
On 2 Sep 2020, at 14:48, Jason Gunthorpe wrote: > On Wed, Sep 02, 2020 at 02:45:37PM -0400, Zi Yan wrote: > >>> Surprised this doesn't touch mm/pagewalk.c ? >> >> 1GB PUD page support is present for DAX purpose, so the code is there >> in mm/pagewalk.c alr

Re: [RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-02 Thread Zi Yan
On 2 Sep 2020, at 14:40, Jason Gunthorpe wrote: > On Wed, Sep 02, 2020 at 02:06:12PM -0400, Zi Yan wrote: >> From: Zi Yan >> >> Hi all, >> >> This patchset adds support for 1GB THP on x86_64. It is on top of >> v5.9-rc2-mmots-2020-08-25-21-13. >

[RFC PATCH 07/16] mm: stats: make smap stats understand PUD THPs.

2020-09-02 Thread Zi Yan
From: Zi Yan Signed-off-by: Zi Yan --- fs/proc/task_mmu.c | 63 ++ 1 file changed, 58 insertions(+), 5 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 7fc9b3cc48d3..2ff80a9c8b57 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc

[RFC PATCH 11/16] mm: thp: 1GB THP follow_p*d_page() support.

2020-09-02 Thread Zi Yan
From: Zi Yan Add follow_page support for 1GB THPs. Signed-off-by: Zi Yan --- include/linux/huge_mm.h | 11 +++ mm/gup.c| 60 - mm/huge_memory.c| 73 - 3 files changed, 142 insertions(+), 2

[RFC PATCH 09/16] mm: thp: 1GB THP support in try_to_unmap().

2020-09-02 Thread Zi Yan
From: Zi Yan Unmap different subpages in different sized THPs properly in the try_to_unmap() function. Signed-off-by: Zi Yan --- mm/migrate.c | 2 +- mm/rmap.c| 159 +-- 2 files changed, 116 insertions(+), 45 deletions(-) diff --git a/mm

[RFC PATCH 12/16] mm: support 1GB THP pagemap support.

2020-09-02 Thread Zi Yan
From: Zi Yan Print page flags properly. Signed-off-by: Zi Yan --- fs/proc/task_mmu.c | 59 ++ 1 file changed, 59 insertions(+) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 2ff80a9c8b57..7254c7ecf659 100644 --- a/fs/proc/task_mmu.c

[RFC PATCH 14/16] mm: page_alloc: >=MAX_ORDER pages allocation an deallocation.

2020-09-02 Thread Zi Yan
From: Zi Yan Use alloc_contig_pages for allocation and destroy_compound_gigantic_page for deallocation, so 1GB THP can be created and destroyed without changing MAX_ORDER. Signed-off-by: Zi Yan --- mm/hugetlb.c| 22 -- mm/internal.h | 2 ++ mm/mempolicy.c | 15

[RFC PATCH 13/16] mm: thp: add a knob to enable/disable 1GB THPs.

2020-09-02 Thread Zi Yan
From: Zi Yan It does not affect existing 1GB THPs. It is similar to the knob for 2MB THPs. Signed-off-by: Zi Yan --- include/linux/huge_mm.h | 14 ++ mm/huge_memory.c| 40 mm/memory.c | 2 +- 3 files changed, 55

[RFC PATCH 06/16] mm: thp: add 1GB THP split_huge_pud_page() function.

2020-09-02 Thread Zi Yan
From: Zi Yan It mimics PMD-level THP split. In addition, to support PMD-mapped PUD THP, PMDPageInPUD() is used. For the mapcount of PMD-mapped PUD THP, sub_compound_mapcount() is used, which uses (head_page+3).compound_mapcount, since each base page's mapcount is used for PTE mapping

[RFC PATCH 16/16] mm: thp: use cma reservation for pud thp allocation.

2020-09-02 Thread Zi Yan
From: Zi Yan Sharing hugepage_cma reservation with hugetlb for pud thp allocation. The reserved cma regions still can be used for moveable page allocations. During 1GB page split, all subpages are cleared from the CMA bitmap, since they are no longer 1GB pages and will be freed via the normal

[RFC PATCH 15/16] hugetlb: cma: move cma reserve function to cma.c.

2020-09-02 Thread Zi Yan
From: Zi Yan It will be used by other allocations, like 1GB THP allocation in the upcoming commit. Signed-off-by: Zi Yan --- .../admin-guide/kernel-parameters.txt | 2 +- arch/arm64/mm/hugetlbpage.c | 2 +- arch/powerpc/mm/hugetlbpage.c | 2

[RFC PATCH 05/16] mm: thp: handling 1GB THP reference bit.

2020-09-02 Thread Zi Yan
From: Zi Yan Add PUD-level TLB flush ops and teach page_vma_mapped_talk about 1GB THPs. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgtable.h | 3 +++ arch/x86/mm/pgtable.c | 13 + include/linux/mmu_notifier.h | 13 + include/linux/pgtable.h| 14

[RFC PATCH 08/16] mm: page_vma_walk: teach it about PMD-mapped PUD THP.

2020-09-02 Thread Zi Yan
From: Zi Yan We now have PMD-mapped PUD THP and PTE-mapped PUD THP, page_vma_walk should handle them properly. Signed-off-by: Zi Yan --- mm/page_vma_mapped.c | 116 ++- 1 file changed, 82 insertions(+), 34 deletions(-) diff --git a/mm/page_vma_mapped.c

[RFC PATCH 10/16] mm: thp: split 1GB THPs at page reclaim.

2020-09-02 Thread Zi Yan
From: Zi Yan We cannot swap 1GB THPs, so split them before swapping them out. Signed-off-by: Zi Yan --- mm/swap_slots.c | 2 ++ mm/vmscan.c | 58 + 2 files changed, 46 insertions(+), 14 deletions(-) diff --git a/mm/swap_slots.c b/mm

[RFC PATCH 00/16] 1GB THP support on x86_64

2020-09-02 Thread Zi Yan
From: Zi Yan Hi all, This patchset adds support for 1GB THP on x86_64. It is on top of v5.9-rc2-mmots-2020-08-25-21-13. 1GB THP is more flexible for reducing translation overhead and increasing the performance of applications with large memory footprint without application changes compared

[RFC PATCH 02/16] mm: thp: 1GB anonymous page implementation.

2020-09-02 Thread Zi Yan
From: Zi Yan This adds 1GB THP support for anonymous pages. Applications can get 1GB pages during page faults when their VMAs are larger than 1GB. For read-only 1GB zero THP, a shared 1GB zero THP is created for all readers. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgalloc.h | 59

[RFC PATCH 01/16] mm: add pagechain container for storing multiple pages.

2020-09-02 Thread Zi Yan
From: Zi Yan When depositing page table pages for 1GB THPs, we need 512 PTE pages + 1 PMD page. Instead of counting and depositing 513 pages, we can use the PMD page as a leader page and chain the rest 512 PTE pages with ->lru. This, however, prevents us depositing PMD pages with ->lru,

[RFC PATCH 04/16] mm: thp: 1GB THP copy on write implementation.

2020-09-02 Thread Zi Yan
From: Zi Yan COW on 1GB THPs will fall back to 2MB THPs if 1GB THP is not available. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgalloc.h | 9 ++ include/linux/huge_mm.h| 5 mm/huge_memory.c | 54 ++ mm/memory.c

[RFC PATCH 03/16] mm: proc: add 1GB THP kpageflag.

2020-09-02 Thread Zi Yan
From: Zi Yan Bit 27 is used to identify 1GB THP. Signed-off-by: Zi Yan --- fs/proc/page.c | 2 ++ include/uapi/linux/kernel-page-flags.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/fs/proc/page.c b/fs/proc/page.c index f3b39a7d2bf3..e4e2ad3612c9 100644

Re: [PATCH V4] mm/vmstat: Add events for THP migration without split

2020-07-09 Thread Zi Yan
On 9 Jul 2020, at 12:39, Randy Dunlap wrote: > On 7/9/20 9:34 AM, Zi Yan wrote: >> On 9 Jul 2020, at 11:34, Randy Dunlap wrote: >> >>> Hi, >>> >>> I have a few comments on this. >>> >>> a. I reported it very early and should have been C

Re: [PATCH V4] mm/vmstat: Add events for THP migration without split

2020-07-09 Thread Zi Yan
On 9 Jul 2020, at 11:34, Randy Dunlap wrote: > Hi, > > I have a few comments on this. > > a. I reported it very early and should have been Cc-ed. > > b. A patch that applies to mmotm or linux-next would have been better > than a full replacement patch. > > c. I tried replacing what I believe is

Re: [RFC] [PATCH 0/8] Migrate Pages in lieu of discard

2020-07-01 Thread Zi Yan
On 30 Jun 2020, at 15:31, Dave Hansen wrote: > > >> BTW is this proposal only for systems having multi-tiers of memory? >> Can a multi-node DRAM-only system take advantage of this proposal? For >> example I have a system with two DRAM nodes running two jobs >> hardwalled to each node. For each

Re: [PATCH 13/16] mm: support THP migration to device private memory

2020-06-22 Thread Zi Yan
On 22 Jun 2020, at 17:31, Ralph Campbell wrote: > On 6/22/20 1:10 PM, Zi Yan wrote: >> On 22 Jun 2020, at 15:36, Ralph Campbell wrote: >> >>> On 6/21/20 4:20 PM, Zi Yan wrote: >>>> On 19 Jun 2020, at 17:56, Ralph Campbell wrote: >>>> >>>

Re: [PATCH 13/16] mm: support THP migration to device private memory

2020-06-22 Thread Zi Yan
On 22 Jun 2020, at 15:36, Ralph Campbell wrote: > On 6/21/20 4:20 PM, Zi Yan wrote: >> On 19 Jun 2020, at 17:56, Ralph Campbell wrote: >> >>> Support transparent huge page migration to ZONE_DEVICE private memory. >>> A new flag (MIGRATE_PFN_COMPOUND

Re: [PATCH 14/16] mm/thp: add THP allocation helper

2020-06-21 Thread Zi Yan
On 19 Jun 2020, at 17:56, Ralph Campbell wrote: > Transparent huge page allocation policy is controlled by several sysfs > variables. Rather than expose these to each device driver that needs to > allocate THPs, provide a helper function. > > Signed-off-by: Ralph Campbell > --- >

Re: [PATCH 13/16] mm: support THP migration to device private memory

2020-06-21 Thread Zi Yan
On 19 Jun 2020, at 17:56, Ralph Campbell wrote: > Support transparent huge page migration to ZONE_DEVICE private memory. > A new flag (MIGRATE_PFN_COMPOUND) is added to the input PFN array to > indicate the huge page was fully mapped by the CPU. > Export prep_compound_page() so that device

Re: [PATCH] mm: thp: remove debug_cow switch

2020-06-16 Thread Zi Yan
> Signed-off-by: Yang Shi > --- Makes sense to me. Reviewed-by: Zi Yan -- Best Regards, Yan Zi

Re: [PATCH v6 05/51] mm: Simplify PageDoubleMap with PF_SECOND policy

2020-06-11 Thread Zi Yan
On 10 Jun 2020, at 16:12, Matthew Wilcox wrote: > From: "Matthew Wilcox (Oracle)" > > Introduce the new page policy of PF_SECOND which lets us use the > normal pageflags generation machinery to create the various DoubleMap > manipulation functions. > > Signed-off-by: Matthew Wilcox (Oracle) >

Re: [PATCH v6 04/51] mm: Move PageDoubleMap bit

2020-06-11 Thread Zi Yan
ar to have Private2 set. Use the > Workingset bit instead which is defined as PF_HEAD so any attempt to > access the Workingset bit on a tail page will redirect to the head page's > Workingset bit. > > Signed-off-by: Matthew Wilcox (Oracle) > --- Make sense to me. Reviewed-by:

Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

2020-06-09 Thread Zi Yan
On 9 Jun 2020, at 7:35, Anshuman Khandual wrote: > On 06/05/2020 07:54 PM, Zi Yan wrote: >> On 4 Jun 2020, at 23:35, Anshuman Khandual wrote: >> >>> On 06/04/2020 10:19 PM, Zi Yan wrote: >>>> On 4 Jun 2020, at 12:36, Matthew Wilcox wrote: >>>>

Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

2020-06-05 Thread Zi Yan
On 4 Jun 2020, at 23:35, Anshuman Khandual wrote: > On 06/04/2020 10:19 PM, Zi Yan wrote: >> On 4 Jun 2020, at 12:36, Matthew Wilcox wrote: >> >>> On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote: >>>> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote: >

Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

2020-06-04 Thread Zi Yan
On 4 Jun 2020, at 12:36, Matthew Wilcox wrote: > On Thu, Jun 04, 2020 at 09:51:10AM -0400, Zi Yan wrote: >> On 4 Jun 2020, at 7:34, Matthew Wilcox wrote: >>> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote: >>>> +Quantifying Migration >>>

Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

2020-06-04 Thread Zi Yan
On 4 Jun 2020, at 7:34, Matthew Wilcox wrote: > On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote: >> Add the following new VM events which will help in validating THP migration >> without split. Statistics reported through these new events will help in >> performance debugging.

Re: [PATCH v4 03/36] mm: Allow hpages to be arbitrary order

2020-05-28 Thread Zi Yan
line function rather than a macro. > > Signed-off-by: Matthew Wilcox (Oracle) > --- > include/linux/huge_mm.h | 5 +-- > include/linux/mm.h | 96 - > 2 files changed, 50 insertions(+), 51 deletions(-) > Glad to see this change. Thanks.

Re: [PATCH 4/6] mm/hmm: add output flag for compound page mapping

2020-05-26 Thread Zi Yan
On 8 May 2020, at 16:06, Ralph Campbell wrote: > On 5/8/20 12:51 PM, Christoph Hellwig wrote: >> On Fri, May 08, 2020 at 12:20:07PM -0700, Ralph Campbell wrote: >>> hmm_range_fault() returns an array of page frame numbers and flags for >>> how the pages are mapped in the requested process' page
