On 06/20/2018 12:56 PM, Rik van Riel wrote:
> CPUs in TLBSTATE_OK have either received TLB flush IPIs earlier during
> the munmap (when the user memory was unmapped), or have context switched
> and reloaded %CR3 during that stage of the munmap.
> 
> Page table free TLB flushes only need to be sent to CPUs in lazy TLB
> mode, whose TLB contents might not be up to date.
> 
> Signed-off-by: Rik van Riel <[email protected]>
> Tested-by: Song Liu <[email protected]>
> ---
>  arch/x86/mm/tlb.c | 26 +++++++++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 4b27d8469848..40b00055c883 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -752,11 +752,31 @@ void tlb_flush_remove_tables_local(void *arg)
>  void tlb_flush_remove_tables(struct mm_struct *mm)
>  {
>       int cpu = get_cpu();
> +     cpumask_var_t varmask;
> +
> +     if (cpumask_any_but(mm_cpumask(mm), cpu) >= nr_cpu_ids)
> +             return;

Oh, man, we need a wrapper for that little nugget.  I was really
scratching my head about what this was doing until I saw the
cpumask_any_but() comment:

         * Returns >= nr_cpu_ids if no cpus set.


> +     if (!zalloc_cpumask_var(&varmask, GFP_ATOMIC)) {
> +             /* Flush the TLB on all CPUs that have this mm loaded. */
> +		smp_call_function_many(mm_cpumask(mm), tlb_flush_remove_tables_local, (void *)mm, 1);
> +     }

Maybe:

        /*
         * If the cpumask allocation fails, do a brute-force
         * flush on all CPUs that have this mm loaded.
         */

Also, don't you need a return inside the if(){} block here?

>       /*
> -      * XXX: this really only needs to be called for CPUs in lazy TLB mode.
> +      * CPUs in TLBSTATE_OK either received a TLB flush IPI while the user
> +      * pages in this address range were unmapped, or have context switched
> +      * and reloaded %CR3 since then.
> +      *
> +      * Shootdown IPIs at page table freeing time only need to be sent to
> +      * CPUs that may have out of date TLB contents.
>        */
> -     if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
> -		smp_call_function_many(mm_cpumask(mm), tlb_flush_remove_tables_local, (void *)mm, 1);
> +     for_each_cpu(cpu, mm_cpumask(mm)) {
> +             if (per_cpu(cpu_tlbstate.state, cpu) != TLBSTATE_OK)
> +                     cpumask_set_cpu(cpu, varmask);
> +     }
> +
> +	smp_call_function_many(varmask, tlb_flush_remove_tables_local, (void *)mm, 1);
> +     free_cpumask_var(varmask);
>  }

Would this look any better if you wrapped the mask manipulation in a:

void mm_fill_lazy_tlb_cpu_mask(struct mm_struct *mm,
                               cpumask_var_t *varmask)

helper, including the explanation comments?

        cpumask_var_t lazy_cpus;
        
        ... cpumask allocation/fallback here

        mm_fill_lazy_tlb_cpu_mask(mm, &lazy_cpus);
        smp_call_function_many(lazy_cpus, tlb_flush_remove_...
