On Sun, 2012-10-21 at 05:56 -0700, tip-bot for Andrea Arcangeli wrote:

> In get_user_pages_fast() the TLB shootdown code can clear the pagetables
> before firing any TLB flush (the page can't be freed until the TLB
> flushing IPI has been delivered but the pagetables will be cleared well
> before sending any TLB flushing IPI).

I think we want to do this for all gup_fast() implementations. When I
reported this issue I also proposed adding something like
page_table_deref() which we could use through-out. Not sure we want to,
but at least all archs need an audit for this.


> ---
>  arch/x86/mm/gup.c |   23 ++++++++++++++++++++---
>  mm/memory.c       |    2 +-
>  2 files changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
> index dd74e46..6dc9921 100644
> --- a/arch/x86/mm/gup.c
> +++ b/arch/x86/mm/gup.c
> @@ -150,7 +150,13 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, 
> unsigned long end,
>  
>       pmdp = pmd_offset(&pud, addr);
>       do {
> -             pmd_t pmd = *pmdp;
> +             /*
> +              * With THP and hugetlbfs the pmd can change from
> +              * under us and it can be cleared as well by the TLB
> +              * shootdown, so read it with ACCESS_ONCE to do all
> +              * computations on the same sampling.
> +              */
> +             pmd_t pmd = ACCESS_ONCE(*pmdp);
>  
>               next = pmd_addr_end(addr, end);
>               /*
> @@ -220,7 +226,13 @@ static int gup_pud_range(pgd_t pgd, unsigned long addr, 
> unsigned long end,
>  
>       pudp = pud_offset(&pgd, addr);
>       do {
> -             pud_t pud = *pudp;
> +             /*
> +              * With hugetlbfs giga pages the pud can change from
> +              * under us and it can be cleared as well by the TLB
> +              * shootdown, so read it with ACCESS_ONCE to do all
> +              * computations on the same sampling.
> +              */
> +             pud_t pud = ACCESS_ONCE(*pudp);
>  
>               next = pud_addr_end(addr, end);
>               if (pud_none(pud))
> @@ -280,7 +292,12 @@ int __get_user_pages_fast(unsigned long start, int 
> nr_pages, int write,
>       local_irq_save(flags);
>       pgdp = pgd_offset(mm, addr);
>       do {
> -             pgd_t pgd = *pgdp;
> +             /*
> +              * The pgd could be cleared by the TLB shootdown from
> +              * under us so read it with ACCESS_ONCE to do all
> +              * computations on the same sampling.
> +              */
> +             pgd_t pgd = ACCESS_ONCE(*pgdp);
>  
>               next = pgd_addr_end(addr, end);
>               if (pgd_none(pgd))
> diff --git a/mm/memory.c b/mm/memory.c
> index cc8e280..c0de477 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3555,7 +3555,7 @@ int handle_pte_fault(struct mm_struct *mm,
>       pte_t entry;
>       spinlock_t *ptl;
>  
> -     entry = *pte;
> +     entry = ACCESS_ONCE(*pte);
>       if (!pte_present(entry)) {
>               if (pte_none(entry)) {
>                       if (vma->vm_ops) {

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to