Re: [PATCH 1/8] mm: Add optional support for PUD-sized transparent hugepages
On Sat, Jan 02, 2016 at 12:06:38PM -0500, Matthew Wilcox wrote:
> On Mon, Dec 28, 2015 at 12:05:51PM +0200, Kirill A. Shutemov wrote:
> > On Thu, Dec 24, 2015 at 11:20:30AM -0500, Matthew Wilcox wrote:
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 4bf3811..e14634f 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -1958,6 +1977,17 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
> > >  	return ptl;
> > >  }
> > >  
> > > +/*
> > > + * No scalability reason to split PUD locks yet, but follow the same pattern
> > > + * as the PMD locks to make it easier if we have to.
> > > + */
> >
> > I don't think it does any good unless you convert all the other places
> > where we use page_table_lock to protect pud tables (like __pud_alloc())
> > to the same API. I think that would deserve a separate patch.
>
> Sure, a separate patch to convert existing users of the PTL. But I
> don't think it does any harm to introduce the PUD version of the PMD API.
> Maybe with a comment indicating that there is significant work to be done
> in converting existing users to this API?

I think that's fine with a fat comment around the pud_lock() definition.

> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 416b129..7328df0 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -1220,9 +1220,27 @@ static inline unsigned long zap_pud_range(struct mmu_gather *tlb,
> > >  	pud = pud_offset(pgd, addr);
> > >  	do {
> > >  		next = pud_addr_end(addr, end);
> > > +		if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
> > > +			if (next - addr != HPAGE_PUD_SIZE) {
> > > +#ifdef CONFIG_DEBUG_VM
> >
> > IS_ENABLED(CONFIG_DEBUG_VM) ?
> >
> > > +				if (!rwsem_is_locked(&tlb->mm->mmap_sem)) {
> > > +					pr_err("%s: mmap_sem is unlocked! addr=0x%lx end=0x%lx vma->vm_start=0x%lx vma->vm_end=0x%lx\n",
> > > +						__func__, addr, end,
> > > +						vma->vm_start, vma->vm_end);
> >
> > dump_vma(), I guess.
>
> These two issues are copy-and-paste from the existing PMD code. I'm happy
> to update the PMD code to the new-and-improved way of doing things;
> I'm just not keen to have the PMD and PUD code diverge unnecessarily.

Yes, please update the PMD code too. It looks ugly. VM_BUG_ON_VMA() is
probably the right way to deal with this.

--
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/8] mm: Add optional support for PUD-sized transparent hugepages
On Mon, Dec 28, 2015 at 12:05:51PM +0200, Kirill A. Shutemov wrote:
> On Thu, Dec 24, 2015 at 11:20:30AM -0500, Matthew Wilcox wrote:
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 4bf3811..e14634f 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1958,6 +1977,17 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
> >  	return ptl;
> >  }
> >  
> > +/*
> > + * No scalability reason to split PUD locks yet, but follow the same pattern
> > + * as the PMD locks to make it easier if we have to.
> > + */
>
> I don't think it does any good unless you convert all the other places
> where we use page_table_lock to protect pud tables (like __pud_alloc())
> to the same API. I think that would deserve a separate patch.

Sure, a separate patch to convert existing users of the PTL. But I
don't think it does any harm to introduce the PUD version of the PMD API.
Maybe with a comment indicating that there is significant work to be done
in converting existing users to this API?

> > diff --git a/mm/memory.c b/mm/memory.c
> > index 416b129..7328df0 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1220,9 +1220,27 @@ static inline unsigned long zap_pud_range(struct mmu_gather *tlb,
> >  	pud = pud_offset(pgd, addr);
> >  	do {
> >  		next = pud_addr_end(addr, end);
> > +		if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
> > +			if (next - addr != HPAGE_PUD_SIZE) {
> > +#ifdef CONFIG_DEBUG_VM
>
> IS_ENABLED(CONFIG_DEBUG_VM) ?
>
> > +				if (!rwsem_is_locked(&tlb->mm->mmap_sem)) {
> > +					pr_err("%s: mmap_sem is unlocked! addr=0x%lx end=0x%lx vma->vm_start=0x%lx vma->vm_end=0x%lx\n",
> > +						__func__, addr, end,
> > +						vma->vm_start, vma->vm_end);
>
> dump_vma(), I guess.

These two issues are copy-and-paste from the existing PMD code. I'm happy
to update the PMD code to the new-and-improved way of doing things;
I'm just not keen to have the PMD and PUD code diverge unnecessarily.
Re: [PATCH 1/8] mm: Add optional support for PUD-sized transparent hugepages
On Thu, Dec 24, 2015 at 11:20:30AM -0500, Matthew Wilcox wrote:
> The only major difference is how the new ->pud_entry method in mm_walk
> works. The ->pmd_entry method replaces the ->pte_entry method, whereas
> the ->pud_entry method works along with either ->pmd_entry or ->pte_entry.

I think that makes the pagewalk API confusing. We need something more
coherent.

--
 Kirill A. Shutemov
Re: [PATCH 1/8] mm: Add optional support for PUD-sized transparent hugepages
On Thu, Dec 24, 2015 at 11:20:30AM -0500, Matthew Wilcox wrote:
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4bf3811..e14634f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1958,6 +1977,17 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
>  	return ptl;
>  }
>  
> +/*
> + * No scalability reason to split PUD locks yet, but follow the same pattern
> + * as the PMD locks to make it easier if we have to.
> + */

I don't think it does any good unless you convert all the other places
where we use page_table_lock to protect pud tables (like __pud_alloc())
to the same API. I think that would deserve a separate patch.

> +static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
> +{
> +	spinlock_t *ptl = &mm->page_table_lock;
> +	spin_lock(ptl);
> +	return ptl;
> +}
> +
>  extern void free_area_init(unsigned long * zones_size);
>  extern void free_area_init_node(int nid, unsigned long * zones_size,
>  		unsigned long zone_start_pfn, unsigned long *zholes_size);

...

> diff --git a/mm/memory.c b/mm/memory.c
> index 416b129..7328df0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1220,9 +1220,27 @@ static inline unsigned long zap_pud_range(struct mmu_gather *tlb,
>  	pud = pud_offset(pgd, addr);
>  	do {
>  		next = pud_addr_end(addr, end);
> +		if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
> +			if (next - addr != HPAGE_PUD_SIZE) {
> +#ifdef CONFIG_DEBUG_VM

IS_ENABLED(CONFIG_DEBUG_VM) ?

> +				if (!rwsem_is_locked(&tlb->mm->mmap_sem)) {
> +					pr_err("%s: mmap_sem is unlocked! addr=0x%lx end=0x%lx vma->vm_start=0x%lx vma->vm_end=0x%lx\n",
> +						__func__, addr, end,
> +						vma->vm_start, vma->vm_end);

dump_vma(), I guess.

> +					BUG();
> +				}
> +#endif
> +				split_huge_pud(vma, pud, addr);
> +			} else if (zap_huge_pud(tlb, vma, pud, addr))
> +				goto next;
> +			/* fall through */
> +		}
>  		if (pud_none_or_clear_bad(pud))
>  			continue;
>  		next = zap_pmd_range(tlb, vma, pud, addr, next, details);
> +next:
> +		cond_resched();
>  	} while (pud++, addr = next, addr != end);
>  
>  	return addr;

--
 Kirill A. Shutemov