Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
On 10/01/2012 10:59 PM, Andrea Arcangeli wrote: Hi Will, On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, pmd_t orig_pmd) +{ + pmd_t entry; + + spin_lock(>page_table_lock); + entry = pmd_mkyoung(orig_pmd); + if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0)) + update_mmu_cache(vma, address, pmd); If the pmd is being splitted, this may not be a trasnhuge pmd anymore by the time you obtained the lock. (orig_pmd could be stale, and it wasn't verified with pmd_same either) Could you tell me when should call pmd_same in general? The lock should be obtained through pmd_trans_huge_lock. if (pmd_trans_huge_lock(orig_pmd, vma) == 1) { set young bit spin_unlock(>page_table_lock); } On x86: int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t entry, int dirty) { int changed = !pmd_same(*pmdp, entry); VM_BUG_ON(address & ~HPAGE_PMD_MASK); if (changed && dirty) { *pmdp = entry; with dirty == 0 it looks like it won't make any difference, but I guess your arm pmdp_set_access_flag is different. However it seems "dirty" means write access and so the invocation would better match the pte case: if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, flags & FAULT_FLAG_WRITE)) But note, you still have to update it even when "dirty" == 0, or it'll still infinite loop for read accesses. + spin_unlock(>page_table_lock); +} + int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd) { diff --git a/mm/memory.c b/mm/memory.c index 5736170..d5c007d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3537,7 +3537,11 @@ retry: if (unlikely(ret & VM_FAULT_OOM)) goto retry; return ret; + } else { + huge_pmd_set_accessed(mm, vma, address, pmd, + orig_pmd); } + return 0; Thanks, Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: mailto:"d...@kvack.org;> em...@kvack.org -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
On 10/01/2012 10:59 PM, Andrea Arcangeli wrote: Hi Will, On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, pmd_t orig_pmd) +{ + pmd_t entry; + + spin_lock(mm-page_table_lock); + entry = pmd_mkyoung(orig_pmd); + if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, 0)) + update_mmu_cache(vma, address, pmd); If the pmd is being splitted, this may not be a trasnhuge pmd anymore by the time you obtained the lock. (orig_pmd could be stale, and it wasn't verified with pmd_same either) Could you tell me when should call pmd_same in general? The lock should be obtained through pmd_trans_huge_lock. if (pmd_trans_huge_lock(orig_pmd, vma) == 1) { set young bit spin_unlock(mm-page_table_lock); } On x86: int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t entry, int dirty) { int changed = !pmd_same(*pmdp, entry); VM_BUG_ON(address ~HPAGE_PMD_MASK); if (changed dirty) { *pmdp = entry; with dirty == 0 it looks like it won't make any difference, but I guess your arm pmdp_set_access_flag is different. However it seems dirty means write access and so the invocation would better match the pte case: if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, flags FAULT_FLAG_WRITE)) But note, you still have to update it even when dirty == 0, or it'll still infinite loop for read accesses. + spin_unlock(mm-page_table_lock); +} + int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd) { diff --git a/mm/memory.c b/mm/memory.c index 5736170..d5c007d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3537,7 +3537,11 @@ retry: if (unlikely(ret VM_FAULT_OOM)) goto retry; return ret; + } else { + huge_pmd_set_accessed(mm, vma, address, pmd, + orig_pmd); } + return 0; Thanks, Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
On Mon, Oct 01, 2012 at 03:59:44PM +0100, Andrea Arcangeli wrote: > Hi Will, Hi Andrea, Kirill, Thanks for the comments. > On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: > > +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct > > *vma, > > + unsigned long address, pmd_t *pmd, pmd_t orig_pmd) > > +{ > > + pmd_t entry; > > + > > + spin_lock(>page_table_lock); > > + entry = pmd_mkyoung(orig_pmd); > > + if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0)) > > + update_mmu_cache(vma, address, pmd); > > If the pmd is being splitted, this may not be a trasnhuge pmd anymore > by the time you obtained the lock. (orig_pmd could be stale, and it > wasn't verified with pmd_same either) > > The lock should be obtained through pmd_trans_huge_lock. > > if (pmd_trans_huge_lock(orig_pmd, vma) == 1) > { > set young bit > spin_unlock(>page_table_lock); > } I didn't notice that -- thanks. I'll move the locking outside of the _set_accessed function and direct it via that function instead. > On x86: > > int pmdp_set_access_flags(struct vm_area_struct *vma, > unsigned long address, pmd_t *pmdp, > pmd_t entry, int dirty) > { > int changed = !pmd_same(*pmdp, entry); > > VM_BUG_ON(address & ~HPAGE_PMD_MASK); > > if (changed && dirty) { > *pmdp = entry; > > with dirty == 0 it looks like it won't make any difference, but I > guess your arm pmdp_set_access_flag is different. We use the generic code, which ignores the dirty argument. Still, we should pass the correct value through anyway, so I'll fix that too. > However it seems "dirty" means write access and so the invocation > would better match the pte case: > > if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, > flags & FAULT_FLAG_WRITE)) > > > But note, you still have to update it even when "dirty" == 0, or it'll > still infinite loop for read accesses. Yup. v2 to follow once we've re-run our testing. Cheers, Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
Hi Will, On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: > +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, > +unsigned long address, pmd_t *pmd, pmd_t orig_pmd) > +{ > + pmd_t entry; > + > + spin_lock(>page_table_lock); > + entry = pmd_mkyoung(orig_pmd); > + if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0)) > + update_mmu_cache(vma, address, pmd); If the pmd is being splitted, this may not be a trasnhuge pmd anymore by the time you obtained the lock. (orig_pmd could be stale, and it wasn't verified with pmd_same either) The lock should be obtained through pmd_trans_huge_lock. if (pmd_trans_huge_lock(orig_pmd, vma) == 1) { set young bit spin_unlock(>page_table_lock); } On x86: int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t entry, int dirty) { int changed = !pmd_same(*pmdp, entry); VM_BUG_ON(address & ~HPAGE_PMD_MASK); if (changed && dirty) { *pmdp = entry; with dirty == 0 it looks like it won't make any difference, but I guess your arm pmdp_set_access_flag is different. However it seems "dirty" means write access and so the invocation would better match the pte case: if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, flags & FAULT_FLAG_WRITE)) But note, you still have to update it even when "dirty" == 0, or it'll still infinite loop for read accesses. > + spin_unlock(>page_table_lock); > +} > + > int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, > unsigned long address, pmd_t *pmd, pmd_t orig_pmd) > { > diff --git a/mm/memory.c b/mm/memory.c > index 5736170..d5c007d 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3537,7 +3537,11 @@ retry: > if (unlikely(ret & VM_FAULT_OOM)) > goto retry; > return ret; > + } else { > + huge_pmd_set_accessed(mm, vma, address, pmd, > + orig_pmd); > } > + > return 0; Thanks, Andrea -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: > diff --git a/mm/memory.c b/mm/memory.c > index 5736170..d5c007d 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3537,7 +3537,11 @@ retry: > if (unlikely(ret & VM_FAULT_OOM)) > goto retry; > return ret; > + } else { > + huge_pmd_set_accessed(mm, vma, address, pmd, > + orig_pmd); I think putting it to 'else' is wrong. You should not touch pmd, if it's under splitting. > } > + > return 0; > } > } -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: thp: Set the accessed flag for old pages on access fault.
From: Steve Capper On x86 memory accesses to pages without the ACCESSED flag set result in the ACCESSED flag being set automatically. With the ARM architecture a page access fault is raised instead (and it will continue to be raised until the ACCESSED flag is set for the appropriate PTE/PMD). For normal memory pages, handle_pte_fault will call pte_mkyoung (effectively setting the ACCESSED flag). For transparent huge pages, pmd_mkyoung will only be called for a write fault. This patch ensures that faults on transparent hugepages which do not result in a CoW update the access flags for the faulting pmd. Cc: Andrea Arcangeli Cc: Chris Metcalf Signed-off-by: Steve Capper Signed-off-by: Will Deacon --- Hello again, This is another fix for an issue that we discovered when porting THP to ARM but it somehow managed to slip through the cracks. Will include/linux/huge_mm.h |3 +++ mm/huge_memory.c| 12 mm/memory.c |4 3 files changed, 19 insertions(+), 0 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 4c59b11..bbc62ad 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -8,6 +8,9 @@ extern int do_huge_pmd_anonymous_page(struct mm_struct *mm, extern int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr, struct vm_area_struct *vma); +extern void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, + pmd_t orig_pmd); extern int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index d684934..ee9cc3b 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -914,6 +914,18 @@ out_free_pages: goto out; } +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, pmd_t orig_pmd) +{ + pmd_t entry; + + spin_lock(>page_table_lock); + entry = pmd_mkyoung(orig_pmd); + if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0)) + update_mmu_cache(vma, address, pmd); + spin_unlock(>page_table_lock); +} + int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd) { diff --git a/mm/memory.c b/mm/memory.c index 5736170..d5c007d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3537,7 +3537,11 @@ retry: if (unlikely(ret & VM_FAULT_OOM)) goto retry; return ret; + } else { + huge_pmd_set_accessed(mm, vma, address, pmd, + orig_pmd); } + return 0; } } -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: thp: Set the accessed flag for old pages on access fault.
From: Steve Capper steve.cap...@arm.com On x86 memory accesses to pages without the ACCESSED flag set result in the ACCESSED flag being set automatically. With the ARM architecture a page access fault is raised instead (and it will continue to be raised until the ACCESSED flag is set for the appropriate PTE/PMD). For normal memory pages, handle_pte_fault will call pte_mkyoung (effectively setting the ACCESSED flag). For transparent huge pages, pmd_mkyoung will only be called for a write fault. This patch ensures that faults on transparent hugepages which do not result in a CoW update the access flags for the faulting pmd. Cc: Andrea Arcangeli aarca...@redhat.com Cc: Chris Metcalf cmetc...@tilera.com Signed-off-by: Steve Capper steve.cap...@arm.com Signed-off-by: Will Deacon will.dea...@arm.com --- Hello again, This is another fix for an issue that we discovered when porting THP to ARM but it somehow managed to slip through the cracks. Will include/linux/huge_mm.h |3 +++ mm/huge_memory.c| 12 mm/memory.c |4 3 files changed, 19 insertions(+), 0 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 4c59b11..bbc62ad 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -8,6 +8,9 @@ extern int do_huge_pmd_anonymous_page(struct mm_struct *mm, extern int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr, struct vm_area_struct *vma); +extern void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, + pmd_t orig_pmd); extern int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index d684934..ee9cc3b 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -914,6 +914,18 @@ out_free_pages: goto out; } +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, pmd_t orig_pmd) +{ + pmd_t entry; + + spin_lock(mm-page_table_lock); + entry = pmd_mkyoung(orig_pmd); + if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, 0)) + update_mmu_cache(vma, address, pmd); + spin_unlock(mm-page_table_lock); +} + int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd) { diff --git a/mm/memory.c b/mm/memory.c index 5736170..d5c007d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3537,7 +3537,11 @@ retry: if (unlikely(ret VM_FAULT_OOM)) goto retry; return ret; + } else { + huge_pmd_set_accessed(mm, vma, address, pmd, + orig_pmd); } + return 0; } } -- 1.7.4.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: diff --git a/mm/memory.c b/mm/memory.c index 5736170..d5c007d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3537,7 +3537,11 @@ retry: if (unlikely(ret VM_FAULT_OOM)) goto retry; return ret; + } else { + huge_pmd_set_accessed(mm, vma, address, pmd, + orig_pmd); I think putting it to 'else' is wrong. You should not touch pmd, if it's under splitting. } + return 0; } } -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
Hi Will, On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, +unsigned long address, pmd_t *pmd, pmd_t orig_pmd) +{ + pmd_t entry; + + spin_lock(mm-page_table_lock); + entry = pmd_mkyoung(orig_pmd); + if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, 0)) + update_mmu_cache(vma, address, pmd); If the pmd is being splitted, this may not be a trasnhuge pmd anymore by the time you obtained the lock. (orig_pmd could be stale, and it wasn't verified with pmd_same either) The lock should be obtained through pmd_trans_huge_lock. if (pmd_trans_huge_lock(orig_pmd, vma) == 1) { set young bit spin_unlock(mm-page_table_lock); } On x86: int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t entry, int dirty) { int changed = !pmd_same(*pmdp, entry); VM_BUG_ON(address ~HPAGE_PMD_MASK); if (changed dirty) { *pmdp = entry; with dirty == 0 it looks like it won't make any difference, but I guess your arm pmdp_set_access_flag is different. However it seems dirty means write access and so the invocation would better match the pte case: if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, flags FAULT_FLAG_WRITE)) But note, you still have to update it even when dirty == 0, or it'll still infinite loop for read accesses. + spin_unlock(mm-page_table_lock); +} + int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pmd_t *pmd, pmd_t orig_pmd) { diff --git a/mm/memory.c b/mm/memory.c index 5736170..d5c007d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3537,7 +3537,11 @@ retry: if (unlikely(ret VM_FAULT_OOM)) goto retry; return ret; + } else { + huge_pmd_set_accessed(mm, vma, address, pmd, + orig_pmd); } + return 0; Thanks, Andrea -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.
On Mon, Oct 01, 2012 at 03:59:44PM +0100, Andrea Arcangeli wrote: Hi Will, Hi Andrea, Kirill, Thanks for the comments. On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote: +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, pmd_t orig_pmd) +{ + pmd_t entry; + + spin_lock(mm-page_table_lock); + entry = pmd_mkyoung(orig_pmd); + if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, 0)) + update_mmu_cache(vma, address, pmd); If the pmd is being splitted, this may not be a trasnhuge pmd anymore by the time you obtained the lock. (orig_pmd could be stale, and it wasn't verified with pmd_same either) The lock should be obtained through pmd_trans_huge_lock. if (pmd_trans_huge_lock(orig_pmd, vma) == 1) { set young bit spin_unlock(mm-page_table_lock); } I didn't notice that -- thanks. I'll move the locking outside of the _set_accessed function and direct it via that function instead. On x86: int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t entry, int dirty) { int changed = !pmd_same(*pmdp, entry); VM_BUG_ON(address ~HPAGE_PMD_MASK); if (changed dirty) { *pmdp = entry; with dirty == 0 it looks like it won't make any difference, but I guess your arm pmdp_set_access_flag is different. We use the generic code, which ignores the dirty argument. Still, we should pass the correct value through anyway, so I'll fix that too. However it seems dirty means write access and so the invocation would better match the pte case: if (pmdp_set_access_flags(vma, address HPAGE_PMD_MASK, pmd, entry, flags FAULT_FLAG_WRITE)) But note, you still have to update it even when dirty == 0, or it'll still infinite loop for read accesses. Yup. v2 to follow once we've re-run our testing. Cheers, Will -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/