Re: [PATCH v5 00/11] Introduces new count-based method for tracking lockless pagetable walks

2019-10-04 Thread Peter Zijlstra
On Fri, Oct 04, 2019 at 01:42:36PM +0200, Peter Zijlstra wrote:
> If you can find anything there that isn't right, please explain that in
> detail and we'll need to look hard at fixing _that_.

Also, I can't imagine Nick is happy with 128 CPUs banging on that atomic counter, esp. since atomics are

Re: [PATCH v5 00/11] Introduces new count-based method for tracking lockless pagetable walks

2019-10-04 Thread Peter Zijlstra
On Thu, Oct 03, 2019 at 05:36:31PM -0300, Leonardo Bras wrote:
> > Also, I'm not sure I understand things properly.
> >
> > So serialize_against_pte_lookup() wants to wait for all currently
> > out-standing __find_linux_pte() instances (which are very similar to
> > gup_fast).
> >
> > It seems

Re: [PATCH v5 00/11] Introduces new count-based method for tracking lockless pagetable walks

2019-10-03 Thread Leonardo Bras
On Thu, 2019-10-03 at 13:49 -0700, John Hubbard wrote:
> Yes. And to clarify, I was assuming that the changes to mm/gup.c were
> required in order to accomplish your goals. Given that assumption, I
> wanted the generic code to be "proper", and that's what that feedback
> is about.

You assumed

Re: [PATCH v5 00/11] Introduces new count-based method for tracking lockless pagetable walks

2019-10-03 Thread John Hubbard
On 10/3/19 1:36 PM, Leonardo Bras wrote:
> On Thu, 2019-10-03 at 09:29 +0200, Peter Zijlstra wrote:
>> On Wed, Oct 02, 2019 at 10:33:14PM -0300, Leonardo Bras wrote:
...
>> This is something entirely specific to Power, you shouldn't be touching
>> generic code at all.
>
> Up to v4, I was

Re: [PATCH v5 00/11] Introduces new count-based method for tracking lockless pagetable walks

2019-10-03 Thread Leonardo Bras
Hello Peter, thanks for the feedback!

On Thu, 2019-10-03 at 09:29 +0200, Peter Zijlstra wrote:
> On Wed, Oct 02, 2019 at 10:33:14PM -0300, Leonardo Bras wrote:
> > If a process (qemu) with a lot of CPUs (128) try to munmap() a large
> > chunk of memory (496GB) mapped with THP, it takes an average

Re: [PATCH v5 00/11] Introduces new count-based method for tracking lockless pagetable walks

2019-10-03 Thread Peter Zijlstra
On Wed, Oct 02, 2019 at 10:33:14PM -0300, Leonardo Bras wrote:
> If a process (qemu) with a lot of CPUs (128) try to munmap() a large
> chunk of memory (496GB) mapped with THP, it takes an average of 275
> seconds, which can cause a lot of problems to the load (in qemu case,
> the guest will lock