* Peter Xu <pet...@redhat.com> wrote:

> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying a 400MB mmap()ed shmem file; these are the
> times it needs:
>
>   Before: 650.980 ms (+-1.94%)
>   After:  569.396 ms (+-1.38%)

Nice!
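
For reference, I'd imagine the reproducer looks something like the sketch
below - my own reconstruction, not the actual test program; it assumes
memfd_create() as the shmem backing and 4K pages:

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <sys/mman.h>
	#include <sys/time.h>
	#include <unistd.h>

	#define SIZE	(400UL << 20)	/* 400MB */

	int main(void)
	{
		struct timeval t0, t1;
		int fd = memfd_create("shmem-test", 0);	/* shmem-backed file */
		char *buf;
		size_t i;

		if (fd < 0 || ftruncate(fd, SIZE))
			return 1;

		buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
		if (buf == MAP_FAILED)
			return 1;

		gettimeofday(&t0, NULL);
		for (i = 0; i < SIZE; i += 4096)	/* dirty one byte per page */
			buf[i] = 1;
		gettimeofday(&t1, NULL);

		printf("%.3f ms\n", (t1.tv_sec - t0.tv_sec) * 1000.0 +
				    (t1.tv_usec - t0.tv_usec) / 1000.0);
		return 0;
	}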

>  arch/x86/mm/fault.c           |  4 ++++

Reviewed-by: Ingo Molnar <mi...@kernel.org>
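
For context, I'd expect the arch/x86/mm/fault.c side to reduce to a check
along these lines - a sketch only, since the hunk isn't quoted here:

	fault = handle_mm_fault(vma, address, flags, regs);

	/*
	 * The fault was fully handled and the mmap lock was already
	 * dropped by the fault handler - nothing left to do here.
	 */
	if (fault & VM_FAULT_COMPLETED)
		return;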

Minor comment typo:

> +             /*
> +              * We should do the same as VM_FAULT_RETRY, but let's not
> +              * return -EBUSY since that's not reflecting the reality on
> +              * what has happened - we've just fully completed a page
> +              * fault, with the mmap lock released.  Use -EAGAIN to show
> +              * that we want to take the mmap lock _again_.
> +              */

s/reflecting the reality on what has happened
 /reflecting the reality of what has happened

>       ret = handle_mm_fault(vma, address, fault_flags, NULL);
> +
> +     if (ret & VM_FAULT_COMPLETED) {
> +             /*
> +              * NOTE: it's a pity that we need to retake the lock here
> +              * to pair with the unlock() in the callers. Ideally we
> +              * could tell the callers so they do not need to unlock.
> +              */
> +             mmap_read_lock(mm);
> +             *unlocked = true;
> +             return 0;

Indeed that's a pity - I guess more performance could be gained here, 
especially in highly parallel threaded workloads, where retaking the mmap 
lock just so the caller can drop it again adds avoidable contention?
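
E.g. something along these lines - a hypothetical sketch, where the lock
state is reported to the caller instead of being retaken (the
'lock_dropped' out-parameter is made up, it's not in the patch):

	ret = handle_mm_fault(vma, address, fault_flags, NULL);

	if (ret & VM_FAULT_COMPLETED) {
		/*
		 * The fault completed and handle_mm_fault() already
		 * released the mmap lock: tell the caller that there
		 * is nothing left to unlock.
		 */
		*lock_dropped = true;	/* hypothetical out-parameter */
		return 0;
	}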

Thanks,

        Ingo
