Re: [PATCH v3 07/10] thp: implement splitting pmd for huge zero page
On 10/12/2012 12:13 PM, Kirill A. Shutemov wrote:
> On Fri, Oct 12, 2012 at 11:23:37AM +0800, Ni zhan Chen wrote:
> > On 10/02/2012 11:19 PM, Kirill A. Shutemov wrote:
> > >From: "Kirill A. Shutemov"
> > >
> > >We can't split huge zero page itself, but we can split the pmd which
> > >points to it.
> > >
> > >On splitting the pmd we create a table with all ptes set to normal zero
> > >page.
> > >
> > >Signed-off-by: Kirill A. Shutemov
> > >Reviewed-by: Andrea Arcangeli
> > >---
> > > mm/huge_memory.c |   32
> > > 1 files changed, 32 insertions(+), 0 deletions(-)
> > >
> > >diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > >index 95032d3..3f1c59c 100644
> > >--- a/mm/huge_memory.c
> > >+++ b/mm/huge_memory.c
> > >@@ -1600,6 +1600,7 @@ int split_huge_page(struct page *page)
> > > 	struct anon_vma *anon_vma;
> > > 	int ret = 1;
> > >
> > >+	BUG_ON(is_huge_zero_pfn(page_to_pfn(page)));
> > > 	BUG_ON(!PageAnon(page));
> > > 	anon_vma = page_lock_anon_vma(page);
> > > 	if (!anon_vma)
> > >@@ -2503,6 +2504,32 @@ static int khugepaged(void *none)
> > > 	return 0;
> > > }
> > >
> > >+static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
> > >+		unsigned long haddr, pmd_t *pmd)
> > >+{
> > >+	pgtable_t pgtable;
> > >+	pmd_t _pmd;
> > >+	int i;
> > >+
> > >+	pmdp_clear_flush_notify(vma, haddr, pmd);
> >
> > why I can't find function pmdp_clear_flush_notify in kernel source
> > code? Do you mean pmdp_clear_flush_young_notify or something like
> > that?
>
> It was changed recently. See commit
> 2ec74c3 mm: move all mmu notifier invocations to be done outside the PT lock

Oh, thanks!

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 07/10] thp: implement splitting pmd for huge zero page
On Fri, Oct 12, 2012 at 11:23:37AM +0800, Ni zhan Chen wrote:
> On 10/02/2012 11:19 PM, Kirill A. Shutemov wrote:
> >From: "Kirill A. Shutemov"
> >
> >We can't split huge zero page itself, but we can split the pmd which
> >points to it.
> >
> >On splitting the pmd we create a table with all ptes set to normal zero
> >page.
> >
> >Signed-off-by: Kirill A. Shutemov
> >Reviewed-by: Andrea Arcangeli
> >---
> > mm/huge_memory.c |   32
> > 1 files changed, 32 insertions(+), 0 deletions(-)
> >
> >diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >index 95032d3..3f1c59c 100644
> >--- a/mm/huge_memory.c
> >+++ b/mm/huge_memory.c
> >@@ -1600,6 +1600,7 @@ int split_huge_page(struct page *page)
> > 	struct anon_vma *anon_vma;
> > 	int ret = 1;
> >
> >+	BUG_ON(is_huge_zero_pfn(page_to_pfn(page)));
> > 	BUG_ON(!PageAnon(page));
> > 	anon_vma = page_lock_anon_vma(page);
> > 	if (!anon_vma)
> >@@ -2503,6 +2504,32 @@ static int khugepaged(void *none)
> > 	return 0;
> > }
> >
> >+static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
> >+		unsigned long haddr, pmd_t *pmd)
> >+{
> >+	pgtable_t pgtable;
> >+	pmd_t _pmd;
> >+	int i;
> >+
> >+	pmdp_clear_flush_notify(vma, haddr, pmd);
>
> why I can't find function pmdp_clear_flush_notify in kernel source
> code? Do you mean pmdp_clear_flush_young_notify or something like
> that?

It was changed recently. See commit
2ec74c3 mm: move all mmu notifier invocations to be done outside the PT lock

--
 Kirill A. Shutemov
Re: [PATCH v3 07/10] thp: implement splitting pmd for huge zero page
On 10/02/2012 11:19 PM, Kirill A. Shutemov wrote:
>From: "Kirill A. Shutemov"
>
>We can't split huge zero page itself, but we can split the pmd which
>points to it.
>
>On splitting the pmd we create a table with all ptes set to normal zero
>page.
>
>Signed-off-by: Kirill A. Shutemov
>Reviewed-by: Andrea Arcangeli
>---
> mm/huge_memory.c |   32
> 1 files changed, 32 insertions(+), 0 deletions(-)
>
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 95032d3..3f1c59c 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -1600,6 +1600,7 @@ int split_huge_page(struct page *page)
> 	struct anon_vma *anon_vma;
> 	int ret = 1;
>
>+	BUG_ON(is_huge_zero_pfn(page_to_pfn(page)));
> 	BUG_ON(!PageAnon(page));
> 	anon_vma = page_lock_anon_vma(page);
> 	if (!anon_vma)
>@@ -2503,6 +2504,32 @@ static int khugepaged(void *none)
> 	return 0;
> }
>
>+static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
>+		unsigned long haddr, pmd_t *pmd)
>+{
>+	pgtable_t pgtable;
>+	pmd_t _pmd;
>+	int i;
>+
>+	pmdp_clear_flush_notify(vma, haddr, pmd);

why I can't find function pmdp_clear_flush_notify in kernel source
code? Do you mean pmdp_clear_flush_young_notify or something like
that?

>+	/* leave pmd empty until pte is filled */
>+
>+	pgtable = get_pmd_huge_pte(vma->vm_mm);
>+	pmd_populate(vma->vm_mm, &_pmd, pgtable);
>+
>+	for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
>+		pte_t *pte, entry;
>+		entry = pfn_pte(my_zero_pfn(haddr), vma->vm_page_prot);
>+		entry = pte_mkspecial(entry);
>+		pte = pte_offset_map(&_pmd, haddr);
>+		VM_BUG_ON(!pte_none(*pte));
>+		set_pte_at(vma->vm_mm, haddr, pte, entry);
>+		pte_unmap(pte);
>+	}
>+	smp_wmb(); /* make pte visible before pmd */
>+	pmd_populate(vma->vm_mm, pmd, pgtable);
>+}
>+
> void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
> 		pmd_t *pmd)
> {
>@@ -2516,6 +2543,11 @@ void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
> 		spin_unlock(&vma->vm_mm->page_table_lock);
> 		return;
> 	}
>+	if (is_huge_zero_pmd(*pmd)) {
>+		__split_huge_zero_page_pmd(vma, haddr, pmd);
>+		spin_unlock(&vma->vm_mm->page_table_lock);
>+		return;
>+	}
> 	page = pmd_page(*pmd);
> 	VM_BUG_ON(!page_count(page));
> 	get_page(page);
[PATCH v3 07/10] thp: implement splitting pmd for huge zero page
From: "Kirill A. Shutemov"

We can't split huge zero page itself, but we can split the pmd which
points to it.

On splitting the pmd we create a table with all ptes set to normal zero
page.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Andrea Arcangeli
---
 mm/huge_memory.c |   32
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 95032d3..3f1c59c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1600,6 +1600,7 @@ int split_huge_page(struct page *page)
 	struct anon_vma *anon_vma;
 	int ret = 1;
 
+	BUG_ON(is_huge_zero_pfn(page_to_pfn(page)));
 	BUG_ON(!PageAnon(page));
 	anon_vma = page_lock_anon_vma(page);
 	if (!anon_vma)
@@ -2503,6 +2504,32 @@ static int khugepaged(void *none)
 	return 0;
 }
 
+static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
+		unsigned long haddr, pmd_t *pmd)
+{
+	pgtable_t pgtable;
+	pmd_t _pmd;
+	int i;
+
+	pmdp_clear_flush_notify(vma, haddr, pmd);
+	/* leave pmd empty until pte is filled */
+
+	pgtable = get_pmd_huge_pte(vma->vm_mm);
+	pmd_populate(vma->vm_mm, &_pmd, pgtable);
+
+	for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
+		pte_t *pte, entry;
+		entry = pfn_pte(my_zero_pfn(haddr), vma->vm_page_prot);
+		entry = pte_mkspecial(entry);
+		pte = pte_offset_map(&_pmd, haddr);
+		VM_BUG_ON(!pte_none(*pte));
+		set_pte_at(vma->vm_mm, haddr, pte, entry);
+		pte_unmap(pte);
+	}
+	smp_wmb(); /* make pte visible before pmd */
+	pmd_populate(vma->vm_mm, pmd, pgtable);
+}
+
 void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
 		pmd_t *pmd)
 {
@@ -2516,6 +2543,11 @@ void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
 		spin_unlock(&vma->vm_mm->page_table_lock);
 		return;
 	}
+	if (is_huge_zero_pmd(*pmd)) {
+		__split_huge_zero_page_pmd(vma, haddr, pmd);
+		spin_unlock(&vma->vm_mm->page_table_lock);
+		return;
+	}
 	page = pmd_page(*pmd);
 	VM_BUG_ON(!page_count(page));
 	get_page(page);
-- 
1.7.7.6
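
For readers following the thread, the ordering in __split_huge_zero_page_pmd()
(clear the pmd, fill every pte with the normal zero page, then publish the
table) can be modelled in plain userspace C. This is only a rough sketch under
stated assumptions, not kernel code: toy_pte, toy_pmd, TOY_ZERO_PFN and
toy_split_huge_zero_pmd() are hypothetical names, the caller supplies the pte
table instead of get_pmd_huge_pte(), and a C11 release store stands in for
smp_wmb() followed by pmd_populate(). The value 512 assumes x86-64 with 4K
pages.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PTRS_PER_PMD 512     /* assumption: HPAGE_PMD_NR on x86-64, 4K pages */
#define TOY_ZERO_PFN     0x1234u /* hypothetical pfn of the normal zero page */

/* Toy pte: just enough state to show what the split writes. */
typedef struct {
	uint64_t pfn;
	int present;
	int special;   /* models pte_mkspecial() */
} toy_pte;

/* Toy pmd: either maps the huge zero page or points at a pte table. */
typedef struct {
	_Atomic(toy_pte *) table; /* NULL while the pmd is empty */
	int huge_zero;            /* 1 while the pmd maps the huge zero page */
} toy_pmd;

/*
 * Split a toy pmd that maps the huge zero page.
 * Returns 0 on success, -1 if the pmd is not a huge zero pmd or no
 * table was supplied. The real code runs under page_table_lock; this
 * model is single-threaded apart from the final release store.
 */
int toy_split_huge_zero_pmd(toy_pmd *pmd, toy_pte *pgtable)
{
	assert(pmd != NULL);
	if (!pmd->huge_zero || pgtable == NULL)
		return -1;

	/* Step 1: clear the pmd and leave it empty until the ptes are filled. */
	pmd->huge_zero = 0;
	atomic_store_explicit(&pmd->table, NULL, memory_order_relaxed);

	/* Step 2: point every pte at the normal (4K) zero page. */
	for (size_t i = 0; i < TOY_PTRS_PER_PMD; i++) {
		pgtable[i].pfn = TOY_ZERO_PFN;
		pgtable[i].present = 1;
		pgtable[i].special = 1;
	}

	/* Step 3: publish the table; release ordering makes the ptes
	 * visible before the pmd, as smp_wmb() does in the patch. */
	atomic_store_explicit(&pmd->table, pgtable, memory_order_release);
	return 0;
}
```

A caller would allocate TOY_PTRS_PER_PMD entries with calloc(), set
huge_zero = 1 on a toy_pmd, and call toy_split_huge_zero_pmd(). A second call
on the same pmd returns -1, in the same way that the real path only takes this
branch when is_huge_zero_pmd(*pmd) is true.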