Re: [PATCH v2 12/14] mm/treewide: Remove pXd_huge()

2024-05-27 Thread Peter Xu
On Mon, May 27, 2024 at 06:03:30AM +, Christophe Leroy wrote: > > > Le 18/03/2024 à 21:04, pet...@redhat.com a écrit : > > From: Peter Xu > > > > This API is not used anymore, drop it for the whole tree. > > Some documentation remains in v6.10

Re: [RFC PATCH v2 00/20] Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)

2024-05-27 Thread Peter Xu
lp, he at least knows mm better than me, but he also has other work. > > Hopefully we can make this series work, and replace hugepd. But if we > can't make that work then there is the possibility of just dropping > support for 16M/16G pages with HPT/4K pages. Great, thank you! -- Peter Xu

Re: [RFC PATCH v2 00/20] Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)

2024-05-23 Thread Peter Xu
new code. Currently Oscar offered help on that hugetlb project, and Oscar will start to work on page_walk API refactoring. I guess currently the simple way is we'll work on top of Christophe's series. Some proper review on this series will definitely make it clearer on what we should do next. Thanks, -- Peter Xu

Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

2024-05-23 Thread Peter Xu
On Thu, May 23, 2024 at 05:08:29AM +0200, Oscar Salvador wrote: > On Wed, May 22, 2024 at 05:46:09PM -0400, Peter Xu wrote: > > > Now, ProcessB still has the page mapped, so upon re-accessing it, > > > it will trigger a new MCE event. memory-failure code will see that this

Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

2024-05-22 Thread Peter Xu
then KVM work naturally with that just like a real MCE. One other thing we can do is to inject-poison to the VA together with the page backing it, but that'll pollute a PFN on dst host to be a real bad PFN and won't be able to be used by the dst OS anymore, so it's less optimal. Thanks, -- Peter Xu

Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

2024-05-22 Thread Peter Xu
On Wed, May 15, 2024 at 12:21:51PM +0200, Oscar Salvador wrote: > On Tue, May 14, 2024 at 03:34:24PM -0600, Peter Xu wrote: > > The question is whether we can't. > > > > Now we reserved a swp entry just for hwpoison and it makes sense only > > because we cached the po

Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

2024-05-14 Thread Peter Xu
On Tue, May 14, 2024 at 10:26:49PM +0200, Oscar Salvador wrote: > On Fri, May 10, 2024 at 03:29:48PM -0400, Peter Xu wrote: > > IMHO we shouldn't mention that detail, but only state the effect which is > > to not report the event to syslog. > > > > There's no hard r

Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

2024-05-10 Thread Peter Xu
utually > exclusive). > > Reviewed-by: John Hubbard > Signed-off-by: Axel Rasmussen Acked-by: Peter Xu One nitpick below. > --- > arch/parisc/mm/fault.c | 7 +-- > arch/powerpc/mm/fault.c | 6 -- > arch/x86/mm/fault.c | 6 -- > include/linux/mm_t

Re: [PATCH 1/1] arch/fault: don't print logs for simulated poison errors

2024-05-09 Thread Peter Xu
AULT_SET_HINDEX(hstate_index(h)); > goto out_mutex; > } > diff --git a/mm/memory.c b/mm/memory.c > index d2155ced45f8..29a833b996ae 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3910,7 +3910,7 @@ static vm_fault_t handle_pte_marker(struct vm_fault > *vmf) > > /* Higher priority than uffd-wp when data corrupted */ > if (marker & PTE_MARKER_POISONED) > - return VM_FAULT_HWPOISON; > + return VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_SIM; > > if (pte_marker_entry_uffd_wp(entry)) > return pte_marker_handle_uffd_wp(vmf); > -- > 2.45.0.118.g7fe29c98d7-goog > -- Peter Xu

[PATCH v2] mm/gup: Fix hugepd handling in hugetlb rework

2024-04-30 Thread Peter Xu
2083d721d7 ("mm/gup: handle hugepd for follow_page()") Reviewed-by: David Hildenbrand Signed-off-by: Peter Xu --- v1: https://lore.kernel.org/r/20240428190151.201002-1-pet...@redhat.com This is v2 and dropped the 2nd test patch as a better one can come later, this patch alone is k

Re: [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests

2024-04-29 Thread Peter Xu
are? IIUC it used to be not > > touched because of pte_write() always returns true with a write prefault. > > > > Then we let patch 1 go through first, and drop this one? > > Whatever you prefer! Thanks! Andrew, would you consider taking patch 1 but ignore this patch 2? Or do you prefer me to resend? -- Peter Xu

Re: [PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests

2024-04-29 Thread Peter Xu
On Mon, Apr 29, 2024 at 09:28:15AM +0200, David Hildenbrand wrote: > On 28.04.24 21:01, Peter Xu wrote: > > Prefault, especially with RW, makes the GUP test too easy, and may not yet > > reach the core of the test. > > > > For example, R/O longterm pins will

[PATCH 1/2] mm/gup: Fix hugepd handling in hugetlb rework

2024-04-28 Thread Peter Xu
2083d721d7 ("mm/gup: handle hugepd for follow_page()") Signed-off-by: Peter Xu --- Note: The target commit to be fixed should have just been moved into mm-stable, so no need to cc stable. --- mm/gup.c | 64 ++-- 1 file changed, 39 inserti

[PATCH 0/2] mm/gup: Fix hugepd for longterm R/O pin on Power

2024-04-28 Thread Peter Xu
16MB huge page. Thanks, [1] https://lore.kernel.org/r/20240327152332.950956-1-pet...@redhat.com Peter Xu (2): mm/gup: Fix hugepd handling in hugetlb rework mm/selftests: Don't prefault in gup_longterm tests mm/gup.c | 64 ++- tools/testing

[PATCH 2/2] mm/selftests: Don't prefault in gup_longterm tests

2024-04-28 Thread Peter Xu
at least to cover the unshare case for R/O longterm pins, in which case the first R/O GUP attempt will fault in the page R/O first, then the 2nd will go through the unshare path, checking whether an unshare is needed. Cc: David Hildenbrand Signed-off-by: Peter Xu --- tools/testing/selftests/mm

Re: [PATCH v1 1/3] mm/gup: consistently name GUP-fast functions

2024-04-26 Thread Peter Xu
fix on hugepd putting this aside. I hope that before the end of this year, whatever I'll fix can go away, by removing hugepd completely from Linux. For now that may or may not be as smooth, so we'd better still fix it. -- Peter Xu

Re: [PATCH v1 1/3] mm/gup: consistently name GUP-fast functions

2024-04-26 Thread Peter Xu
On Fri, Apr 26, 2024 at 07:28:31PM +0200, David Hildenbrand wrote: > On 26.04.24 18:12, Peter Xu wrote: > > On Fri, Apr 26, 2024 at 09:44:58AM -0400, Peter Xu wrote: > > > On Fri, Apr 26, 2024 at 09:17:47AM +0200, David Hildenbrand wrote: > > > > On 02.04.24

Re: [PATCH v1 1/3] mm/gup: consistently name GUP-fast functions

2024-04-26 Thread Peter Xu
On Fri, Apr 26, 2024 at 09:44:58AM -0400, Peter Xu wrote: > On Fri, Apr 26, 2024 at 09:17:47AM +0200, David Hildenbrand wrote: > > On 02.04.24 14:55, David Hildenbrand wrote: > > > Let's consistently call the "fast-only" part of GUP "GUP-fast" and rename

Re: [PATCH v1 1/3] mm/gup: consistently name GUP-fast functions

2024-04-26 Thread Peter Xu
gup_hugepte() -> gup_fast_hugepte() > > I just realized that we end up calling these from follow_hugepd() as well. > And something seems to be off, because gup_fast_hugepd() won't have the VMA > even in the slow-GUP case to pass it to gup_must_unshare(). > > So these are GUP-fast fu

Re: [RFC PATCH 0/8] Reimplement huge pages without hugepd on powerpc 8xx

2024-04-16 Thread Peter Xu
On Tue, Apr 16, 2024 at 10:58:33AM +, Christophe Leroy wrote: > > > Le 15/04/2024 à 21:12, Christophe Leroy a écrit : > > > > > > Le 12/04/2024 à 16:30, Peter Xu a écrit : > >> On Fri, Apr 12, 2024 at 02:08:03PM +, Christophe Leroy wrote: > >>

Re: [RFC PATCH 0/8] Reimplement huge pages without hugepd on powerpc 8xx

2024-04-12 Thread Peter Xu
On Fri, Apr 12, 2024 at 02:08:03PM +, Christophe Leroy wrote: > > > Le 11/04/2024 à 18:15, Peter Xu a écrit : > > On Mon, Mar 25, 2024 at 01:38:40PM -0300, Jason Gunthorpe wrote: > >> On Mon, Mar 25, 2024 at 03:55:53PM +0100, Christophe Leroy wrote: > >>>

Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-11 Thread Peter Xu
On Thu, Apr 11, 2024 at 06:55:44PM +0200, Paolo Bonzini wrote: > On Mon, Apr 8, 2024 at 3:56 PM Peter Xu wrote: > > Paolo, > > > > I may miss a bunch of details here (as I still remember some change_pte > > patches previously on the list..), however not sure wheth

Re: [RFC PATCH 0/8] Reimplement huge pages without hugepd on powerpc 8xx

2024-04-11 Thread Peter Xu
ition my next step; it seems like at least I should not be adding any more hugepd code, then should I go with ARCH_HAS_HUGEPD checks, or you're going to have an RFC soon then I can base on top? Thanks, -- Peter Xu

Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2

2024-04-10 Thread Peter Xu
On Wed, Apr 10, 2024 at 04:30:41PM +, Christophe Leroy wrote: > > > Le 10/04/2024 à 17:28, Peter Xu a écrit : > > On Tue, Apr 09, 2024 at 08:43:55PM -0300, Jason Gunthorpe wrote: > >> On Fri, Apr 05, 2024 at 05:42:44PM -0400, Peter Xu wrote: > >>>

Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2

2024-04-10 Thread Peter Xu
On Tue, Apr 09, 2024 at 08:43:55PM -0300, Jason Gunthorpe wrote: > On Fri, Apr 05, 2024 at 05:42:44PM -0400, Peter Xu wrote: > > In short, hugetlb mappings shouldn't be special compared to other huge pXd > > and large folio (cont-pXd) mappings for most of the walkers in my mind,

Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-08 Thread Peter Xu
ecause I remember Andrea used to have a custom tree maintaining that part: https://github.com/aagit/aa/commit/c761078df7a77d13ddfaeebe56a0f4bc128b1968 Maybe it can't be enabled for some reason that I overlooked in the current tree, or we just decided not to? Thanks, -- Peter Xu

Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2

2024-04-05 Thread Peter Xu
On Fri, Apr 05, 2024 at 03:16:33PM -0300, Jason Gunthorpe wrote: > On Thu, Apr 04, 2024 at 05:48:03PM -0400, Peter Xu wrote: > > On Tue, Mar 26, 2024 at 11:02:52AM -0300, Jason Gunthorpe wrote: > > > The more I look at this the more I think we need to get to Matthew's > >

Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2

2024-04-04 Thread Peter Xu
Considering that we already have most of pmd/pud entries around in the mm walker ops. So far it sounds better we leave it for later, until further justified to be useful. And that won't block it if it is ever justified to be needed, I'd say it can also be seen as a step forward if I can make it to remove hugetlb_entry() first. Comments welcomed (before I start to work on anything..). Thanks, -- Peter Xu

Re: [PATCH v4 05/13] mm/arch: Provide pud_pfn() fallback

2024-04-04 Thread Peter Xu
On Thu, Apr 04, 2024 at 08:24:04AM -0300, Jason Gunthorpe wrote: > On Wed, Apr 03, 2024 at 02:25:20PM -0400, Peter Xu wrote: > > > > I'd say the BUILD_BUG has done its job and found an issue, fix it by > > > not defining pud_leaf? I don't see any calls to pud_leaf in

Re: [PATCH v4 05/13] mm/arch: Provide pud_pfn() fallback

2024-04-03 Thread Peter Xu
On Wed, Apr 03, 2024 at 09:08:41AM -0300, Jason Gunthorpe wrote: > On Tue, Apr 02, 2024 at 07:35:45PM -0400, Peter Xu wrote: > > On Tue, Apr 02, 2024 at 07:53:20PM -0300, Jason Gunthorpe wrote: > > > On Tue, Apr 02, 2024 at 06:43:56PM -0400, Peter Xu wrote: > > >

Re: [PATCH v4 05/13] mm/arch: Provide pud_pfn() fallback

2024-04-02 Thread Peter Xu
On Tue, Apr 02, 2024 at 07:53:20PM -0300, Jason Gunthorpe wrote: > On Tue, Apr 02, 2024 at 06:43:56PM -0400, Peter Xu wrote: > > > I actually tested this without hitting the issue (even though I didn't > > mention it in the cover letter..). I re-kicked the build test, it turn

Re: [PATCH v4 05/13] mm/arch: Provide pud_pfn() fallback

2024-04-02 Thread Peter Xu
On Tue, Apr 02, 2024 at 12:05:49PM -0700, Nathan Chancellor wrote: > Hi Peter (and LoongArch folks), > > On Wed, Mar 27, 2024 at 11:23:24AM -0400, pet...@redhat.com wrote: > > From: Peter Xu > > > > The comment in the code explains the reasons. We took a diffe

Re: [PATCH v4 13/13] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-04-02 Thread Peter Xu
s for the tests: > > "transparent_hugepage=madvise earlycon root=/dev/vda2 secretmem.enable > hugepagesz=1G hugepages=0:2,1:2 hugepagesz=32M hugepages=0:2,1:2 > default_hugepagesz=2M hugepages=0:64,1:64 hugepagesz=64K hugepages=0:2,1:2" This helps, thanks. -- Peter Xu

Re: [PATCH v4 13/13] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-04-02 Thread Peter Xu
On Tue, Apr 02, 2024 at 06:39:31PM +0200, David Hildenbrand wrote: > On 02.04.24 18:20, Peter Xu wrote: > > On Tue, Apr 02, 2024 at 05:26:28PM +0200, David Hildenbrand wrote: > > > On 02.04.24 16:48, Ryan Roberts wrote: > > > > Hi Peter, > > > >

Re: [PATCH v4 13/13] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-04-02 Thread Peter Xu
pmd(), called just after the assert I > just commented out. > > > It's triggered by this test: > > # [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd > hugetlb (32768 kB) > > Which is the first MAP_PRIVATE test for cont-pmd mapped hugetlb. (All > MAP_SHARED tests are passing). > > > Looks like can_follow_write_pmd() returns early for VM_SHARED mappings. > > I don't think we only keep the PAE flag in the head page for hugetlb pages? > So we can't just remove this assert? > > I tried just commenting it out and get assert further down follow_huge_pmd(): > > VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) && > !PageAnonExclusive(page), page); I just replied in another email; we can try the two patches I attached, or we can wait until I do some tests (but will be mostly unavailable this afternoon). Thanks, -- Peter Xu

Re: [PATCH v4 13/13] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-04-02 Thread Peter Xu
On Tue, Apr 02, 2024 at 05:26:28PM +0200, David Hildenbrand wrote: > On 02.04.24 16:48, Ryan Roberts wrote: > > Hi Peter, Hey, Ryan, Thanks for the report! > > > > On 27/03/2024 15:23, pet...@redhat.com wrote: > > > From: Peter Xu > > > > > >

Re: [PATCH RFC 0/3] mm/gup: consistently call it GUP-fast

2024-03-27 Thread Peter Xu
ere; as I am doing some build tests recently, I found turning off CONFIG_SAMPLES + CONFIG_GCC_PLUGINS could avoid a lot of issues, I think it's due to libc missing. But maybe not the case there. The series makes sense to me, the naming is confusing. Btw, thanks for posting this as RFC. This definitely has a conflict with the other gup series that I had; I'll post v4 of that shortly. -- Peter Xu

Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2

2024-03-25 Thread Peter Xu
On Fri, Mar 22, 2024 at 01:10:00PM -0300, Jason Gunthorpe wrote: > On Thu, Mar 21, 2024 at 06:07:50PM -0400, pet...@redhat.com wrote: > > From: Peter Xu > > > > v3: > > - Rebased to latest mm-unstable (a824831a082f, of March 21st) > > - Dropped patch to int

Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-03-22 Thread Peter Xu
On Fri, Mar 22, 2024 at 08:45:59PM -0400, Peter Xu wrote: > On Fri, Mar 22, 2024 at 01:48:18PM -0700, Andrew Morton wrote: > > On Thu, 21 Mar 2024 18:08:02 -0400 pet...@redhat.com wrote: > > > > > From: Peter Xu > > > > > > Now follow_page() is read

Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-03-22 Thread Peter Xu
On Fri, Mar 22, 2024 at 01:48:18PM -0700, Andrew Morton wrote: > On Thu, 21 Mar 2024 18:08:02 -0400 pet...@redhat.com wrote: > > > From: Peter Xu > > > > Now follow_page() is ready to handle hugetlb pages in whatever form, and > > over all architectures. S

Re: [PATCH v3 03/12] mm: Make HPAGE_PXD_* macros even if !THP

2024-03-22 Thread Peter Xu
On Fri, Mar 22, 2024 at 10:14:56AM -0700, SeongJae Park wrote: > Hi Peter, Hi, SeongJae, > > On Thu, 21 Mar 2024 18:07:53 -0400 pet...@redhat.com wrote: > > > From: Peter Xu > > > > These macros can be helpful when we plan to merge hugetlb code into ge

Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code

2024-03-22 Thread Peter Xu
issue to solve, IOW, we still don't do that for !hugetlb cont_pte large folios, before or after this series. > > Reviewed-by: Jason Gunthorpe Thanks! -- Peter Xu

Re: [PATCH 09/13] mm/powerpc: Redefine pXd_huge() with pXd_leaf()

2024-03-20 Thread Peter Xu
On Wed, Mar 20, 2024 at 05:40:39PM +, Christophe Leroy wrote: > > > Le 20/03/2024 à 17:09, Peter Xu a écrit : > > On Wed, Mar 20, 2024 at 06:16:43AM +, Christophe Leroy wrote: > >> At the first place that was to get a close fit between hardware > >> paget

Re: [PATCH 09/13] mm/powerpc: Redefine pXd_huge() with pXd_leaf()

2024-03-20 Thread Peter Xu
ilar with 8M pages. > > I'll give it a try and see how it goes. So you're talking about 8M only for 8xx, am I right? There seem to be other PowerPC systems that use hugepd. Is it possible that we convert all hugepd into cont_pte form? Thanks, -- Peter Xu

Re: [PATCH v2 05/14] mm/sparc: Change pXd_huge() behavior to exclude swap entries

2024-03-19 Thread Peter Xu
> - (pmd_val(pmd) & (_PAGE_VALID|_PAGE_PMD_HUGE)) != _PAGE_VALID; > > + return pmd_leaf(pmd);; > > There is a redundant semicolon in the end. Will touch it up, thanks. PS: This will be dropped as a whole in patch 12. -- Peter Xu

Re: [PATCH 12/13] mm/treewide: Remove pXd_huge()

2024-03-14 Thread Peter Xu
On Thu, Mar 14, 2024 at 08:56:59AM +, Christophe Leroy wrote: > > > Le 13/03/2024 à 22:47, pet...@redhat.com a écrit : > > From: Peter Xu > > > > This API is not used anymore, drop it for the whole tree. > > > > Signed-off-by: Peter

Re: [PATCH 11/13] mm/treewide: Replace pXd_huge() with pXd_leaf()

2024-03-14 Thread Peter Xu
On Thu, Mar 14, 2024 at 08:50:20AM +, Christophe Leroy wrote: > > > Le 13/03/2024 à 22:47, pet...@redhat.com a écrit : > > From: Peter Xu > > > > Now after we're sure all pXd_huge() definitions are the same as pXd_leaf(), > > reuse it. Luck

Re: [PATCH 09/13] mm/powerpc: Redefine pXd_huge() with pXd_leaf()

2024-03-14 Thread Peter Xu
On Thu, Mar 14, 2024 at 08:45:34AM +, Christophe Leroy wrote: > > > Le 13/03/2024 à 22:47, pet...@redhat.com a écrit : > > From: Peter Xu > > > > PowerPC book3s 4K mostly has the same definition on both, except pXd_huge() > > constantly returns 0 for hash M

Re: [PATCH RFC 00/13] mm/treewide: Remove pXd_huge() API

2024-03-12 Thread Peter Xu
ne pgd_huge*() instead of pud_huge*(), so that it looks like the only way to provide such a treewide clean API is to properly define those APIs for aarch64, and define different pud helpers for either 3/4 levels. But I confess I don't think I fully digested all the bits. Thanks, -- Peter Xu

Re: [PATCH RFC 01/13] mm/hmm: Process pud swap entry without pud_huge()

2024-03-07 Thread Peter Xu
On Thu, Mar 07, 2024 at 02:12:33PM -0400, Jason Gunthorpe wrote: > On Wed, Mar 06, 2024 at 06:41:35PM +0800, pet...@redhat.com wrote: > > From: Peter Xu > > > > Swap pud entries do not always return true for pud_huge() for all archs. > > x86 and sparc (so far) al

Re: [PATCH RFC 09/13] mm/powerpc: Redefine pXd_huge() with pXd_leaf()

2024-03-06 Thread Peter Xu
On Wed, Mar 06, 2024 at 11:56:56PM +1100, Michael Ellerman wrote: > pet...@redhat.com writes: > > From: Peter Xu > > > > PowerPC book3s 4K mostly has the same definition on both, except pXd_huge() > > constantly returns 0 for hash MMUs. AFAICT that is fine to be re

Re: [PATCH v2 4/7] mm/x86: Drop two unnecessary pud_leaf() definitions

2024-03-04 Thread Peter Xu
On Mon, Mar 04, 2024 at 09:03:34AM -0400, Jason Gunthorpe wrote: > On Thu, Feb 29, 2024 at 04:42:55PM +0800, pet...@redhat.com wrote: > > From: Peter Xu > > > > pud_leaf() has a fallback macro defined in include/linux/pgtable.h already. > > Drop the extra two f

Re: [PATCH 5/5] mm/treewide: Drop pXd_large()

2024-02-28 Thread Peter Xu
in the past it was a silent conflict between the old pud_leaf() macro and pud_leaf() definition, the macro could have silently overwritten the function. IIUC such pud_leaf() is not needed as we have a global fallback. I'll add a pre-requisite patch to remove such pXd_leaf() definitions. -- Peter Xu

Re: [PATCH 0/5] mm/treewide: Replace pXd_large() with pXd_leaf()

2024-02-28 Thread Peter Xu
On Wed, Feb 28, 2024 at 09:50:52AM +, Christophe Leroy wrote: > Le 28/02/2024 à 09:53, pet...@redhat.com a écrit : > > From: Peter Xu > > > > [based on latest akpm/mm-unstable, commit 1274e7646240] > > > > These two APIs are mostly always the same. It's c

Re: [PATCH v2 03/13] mm: Provide generic pmd_thp_or_huge()

2024-02-22 Thread Peter Xu
On Wed, Feb 21, 2024 at 08:57:53AM -0400, Jason Gunthorpe wrote: > On Wed, Feb 21, 2024 at 05:37:37PM +0800, Peter Xu wrote: > > On Mon, Jan 15, 2024 at 01:55:51PM -0400, Jason Gunthorpe wrote: > > > On Wed, Jan 03, 2024 at 05:14:13PM +0800, pet...@redhat.com wrote: >

Re: [PATCH v2 06/13] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2024-02-21 Thread Peter Xu
function in this series? When > does this re-use happen?? It's reused in patch 12 ("mm/gup: Handle hugepd for follow_page()"). Thanks, -- Peter Xu

Re: [PATCH v2 10/13] mm/gup: Handle huge pud for follow_pud_mask()

2024-02-21 Thread Peter Xu
> > pud = READ_ONCE(*pudp); > > - if (pud_none(pud)) > > + if (pud_none(pud) || !pud_present(pud)) > > return no_page_table(vma, flags, address); > > Isn't 'pud_none() || !pud_present()' redundant? A none pud is > non-present, by definition? Hmm yes, seems redundant. Let me drop it. > > > - if (pud_devmap(pud)) { > > + if (pud_huge(pud)) { > > ptl = pud_lock(mm, pudp); > > - page = follow_devmap_pud(vma, address, pudp, flags, > > &ctx->pgmap); > > + page = follow_huge_pud(vma, address, pudp, flags, ctx); > > spin_unlock(ptl); > > if (page) > > return page; > > Otherwise it looks OK to me > > Reviewed-by: Jason Gunthorpe Thanks! -- Peter Xu

Re: [PATCH v2 03/13] mm: Provide generic pmd_thp_or_huge()

2024-02-21 Thread Peter Xu
On Mon, Jan 15, 2024 at 01:55:51PM -0400, Jason Gunthorpe wrote: > On Wed, Jan 03, 2024 at 05:14:13PM +0800, pet...@redhat.com wrote: > > From: Peter Xu > > > > ARM defines pmd_thp_or_huge(), detecting either a THP or a huge PMD. It > > can be a helpful helper i

Re: [PATCH v2 01/13] mm/Kconfig: CONFIG_PGTABLE_HAS_HUGE_LEAVES

2024-01-22 Thread Peter Xu
On Mon, Jan 15, 2024 at 01:37:37PM -0400, Jason Gunthorpe wrote: > On Wed, Jan 03, 2024 at 05:14:11PM +0800, pet...@redhat.com wrote: > > From: Peter Xu > > > > Introduce a config option that will be selected as long as huge leaves are > > involved in pgtable (t

Re: [PATCH v2 00/13] mm/gup: Unify hugetlb, part 2

2024-01-07 Thread Peter Xu
ut I can overlook important things here. It'll be definitely great if hugepd can be merged into some existing forms like a generic pgtable (IMHO cont_* is such case: it's the same as no cont_* entries for software, while hardware can accelerate with TLB hits on larger ranges). But I can be asking a very silly question here too, as I can overlook very important things. Thanks, -- Peter Xu

Re: [PATCH 05/13] mm: Introduce vma_pgtable_walk_{begin|end}()

2024-01-01 Thread Peter Xu
On Mon, Dec 25, 2023 at 02:34:48PM +0800, Muchun Song wrote: > Reviewed-by: Muchun Song You're using the old email address here. Do you want me to also use the linux.dev one that you suggested I use? -- Peter Xu

Re: [PATCH 03/13] mm: Provide generic pmd_thp_or_huge()

2024-01-01 Thread Peter Xu
/asm/pgtable.h:#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd)) So far this series only touches generic code. Would you mind if I keep this patch as-is, and leave renaming to later? > > BTW, please cc me via the new email (muchun.s...@linux.dev) next edition. Sure. Thanks for taking a look. -- Peter Xu

Re: [PATCH 00/13] mm/gup: Unify hugetlb, part 2

2023-12-21 Thread Peter Xu
Copy Muchun, which I forgot since the start, sorry. -- Peter Xu

Re: [PATCH 09/13] mm/gup: Cache *pudp in follow_pud_mask()

2023-12-19 Thread Peter Xu
On Tue, Dec 19, 2023 at 11:28:54AM -0500, James Houghton wrote: > On Tue, Dec 19, 2023 at 2:57 AM wrote: > > > > From: Peter Xu > > > > Introduce "pud_t pud" in the function, so the code won't dereference *pudp > > multiple times. Not only becaus

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-12-04 Thread Peter Xu
so we actually have three users indeed, if not counting potential future archs adding support to also get that same tlb benefit. Thanks, -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-30 Thread Peter Xu
On Fri, Nov 24, 2023 at 11:07:51AM -0500, Peter Xu wrote: > On Fri, Nov 24, 2023 at 09:06:01AM +, Ryan Roberts wrote: > > I don't have any micro-benchmarks for GUP though, if that's your question. > > Is > > there an easy-to-use test I can run to get some numbers? I'd

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-24 Thread Peter Xu
up, it might be relatively easy when comparing to the rest. I'm still hesitating for the long term plan. Please let me know if you have any thoughts on any of above. Thanks! -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-24 Thread Peter Xu
ed if gup is not yet touched from your side, afaict. I'll see whether I can provide some rough numbers instead in the next post (I'll probably only be able to test it in a VM, though, but hopefully that should still reflect mostly the truth). -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-23 Thread Peter Xu
he above series. It's a matter of whether one follow_page_mask() call can fetch more than one page* for a cont_pte entry on aarch64 for a large non-hugetlb folio (and if this series lands, it'll be the same to hugetlb or non-hugetlb). Now the current code can only fetch one page I think. Thanks, -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-23 Thread Peter Xu
err = walk_hugetlb_range(start, end, walk); } else err = walk_pgd_range(start, end, walk); It means to me as long as the vma is hugetlb, it'll not trigger any code in walk_pgd_range(), but only walk_hugetlb_range(). Do you perhaps mean hugepd is used outside hugetlbfs? Thanks, -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-23 Thread Peter Xu
rt for gup on large folios, and whether there's any performance number to share. It's definitely good news to me because it means Ryan's work can also then benefit hugetlb if this series will be merged, I just don't know how much difference there will be. Thanks, -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-23 Thread Peter Xu
epd_t hugepd, unsigned long addr, unsigned int pdshift, unsigned long end, unsigned int flags, struct page **pages, int *nr) -- Peter Xu

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-22 Thread Peter Xu
On Wed, Nov 22, 2023 at 12:00:24AM -0800, Christoph Hellwig wrote: > On Tue, Nov 21, 2023 at 10:59:35AM -0500, Peter Xu wrote: > > > What prevents us from ever using hugepd with file mappings? I think > > > it would naturally fit in with how large folios for the pagecache

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-21 Thread Peter Xu
On Mon, Nov 20, 2023 at 12:26:24AM -0800, Christoph Hellwig wrote: > On Wed, Nov 15, 2023 at 08:29:02PM -0500, Peter Xu wrote: > > Hugepd format is only used in PowerPC with hugetlbfs. In commit > > a6e79df92e4a ("mm/gup: disallow FOLL_LONGTERM GUP-fast writing to > >

[PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-15 Thread Peter Xu
to hugepd. Drop that check, not only because it'll never be true for hugepd, but also it paves way for reusing the function outside fast-gup. Cc: Lorenzo Stoakes Cc: Michael Ellerman Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Peter Xu --- mm/gup.c | 5 - 1 file changed, 5 deletion

Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-22 Thread Peter Xu
if (userfaultfd_armed(vma) && > + !(vma->vm_flags & VM_SHARED)) > + goto recheck; > + } > + } > > - /* Huge page lock is still held, so page table must remain empty */ > - pml = pmd_lock(mm, pmd); > - if (ptl != pml) > - spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); > pgt_pmd = pmdp_collapse_flush(vma, haddr, pmd); > pmdp_get_lockless_sync(); > if (ptl != pml) > @@ -1648,6 +1665,8 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, > unsigned long addr, > } > if (start_pte) > pte_unmap_unlock(start_pte, ptl); > + if (pml && pml != ptl) > + spin_unlock(pml); > if (notified) > mmu_notifier_invalidate_range_end(); > drop_hpage: > -- > 2.35.3 -- Peter Xu

Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-21 Thread Peter Xu
p_read_lock()") > Signed-off-by: Hugh Dickins The locking is indeed slightly complicated.. but I didn't spot anything wrong. Acked-by: Peter Xu Thanks, -- Peter Xu

Re: [PATCH v6 03/33] pgtable: Create struct ptdesc

2023-06-27 Thread Peter Xu
+#else > + spinlock_t ptl; > +#endif > + }; > + unsigned int __page_type; > + atomic_t _refcount; > +#ifdef CONFIG_MEMCG > + unsigned long pt_memcg_data; > +#endif > +}; -- Peter Xu

Re: [PATCH 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-06 Thread Peter Xu
d in pgtable_pte_page_dtor(), in Hugh's series IIUC we need the spinlock being there for the rcu section alongside the page itself. So even if to do so we'll need to also rcu call pgtable_pte_page_dtor() when needed. -- Peter Xu

Re: [PATCH 09/12] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-05-31 Thread Peter Xu
or either cpu or iommu hardware. However OTOH, maybe it'll also be safer to just have the mmu notifiers like before (e.g., no idea whether anything can cache invalidate tlb translations from the empty pgtable)? As that doesn't seem to defeat the purpose of the patchset as notifiers shouldn't fail. > > (FWIW, last I looked, there also seemed to be some other issues with > MMU notifier usage wrt IOMMUv2, see the thread > <https://lore.kernel.org/linux-mm/yzbaf9hw1%2frek...@nvidia.com/>.) > > > > + if (ptl != pml) > > + spin_unlock(ptl); > > + spin_unlock(pml); > > + > > + mm_dec_nr_ptes(mm); > > + page_table_check_pte_clear_range(mm, addr, pgt_pmd); > > + pte_free_defer(mm, pmd_pgtable(pgt_pmd)); > > } > > - i_mmap_unlock_write(mapping); > > - return target_result; > > + i_mmap_unlock_read(mapping); > > } > > > > /** > > @@ -2261,9 +2210,11 @@ static int collapse_file(struct mm_struct *mm, > > unsigned long addr, > > > > /* > > * Remove pte page tables, so we can re-fault the page as huge. > > +* If MADV_COLLAPSE, adjust result to call > > collapse_pte_mapped_thp(). > > */ > > - result = retract_page_tables(mapping, start, mm, addr, hpage, > > -cc); > > + retract_page_tables(mapping, start); > > + if (cc && !cc->is_khugepaged) > > + result = SCAN_PTE_MAPPED_HUGEPAGE; > > unlock_page(hpage); > > > > /* > > -- > > 2.35.3 > > > -- Peter Xu

Re: [PATCH 09/12] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-05-31 Thread Peter Xu
> detail in responses to you there - thanks for your patience :) Not a problem at all here! > > On Mon, 29 May 2023, Peter Xu wrote: > > On Sun, May 28, 2023 at 11:25:15PM -0700, Hugh Dickins wrote: > ... > > > @@ -1748,123 +1747,73 @@ static void > > >

Re: [PATCH 09/12] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-05-29 Thread Peter Xu
> + page_table_check_pte_clear_range(mm, addr, pgt_pmd); > + pte_free_defer(mm, pmd_pgtable(pgt_pmd)); > } > - i_mmap_unlock_write(mapping); > - return target_result; > + i_mmap_unlock_read(mapping); > } > > /** > @@ -2261,9 +2210,11 @@ static int collapse_file(struct mm_struct *mm, > unsigned long addr, > > /* >* Remove pte page tables, so we can re-fault the page as huge. > + * If MADV_COLLAPSE, adjust result to call collapse_pte_mapped_thp(). >*/ > - result = retract_page_tables(mapping, start, mm, addr, hpage, > - cc); > + retract_page_tables(mapping, start); > + if (cc && !cc->is_khugepaged) > + result = SCAN_PTE_MAPPED_HUGEPAGE; > unlock_page(hpage); > > /* > -- > 2.35.3 > -- Peter Xu

Re: [PATCH] mm: remove zap_page_range and create zap_vma_pages

2023-01-04 Thread Peter Xu
_range_single(). > - Remove zap_page_range. > > [1] > https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.krav...@oracle.com/ > Suggested-by: Peter Xu > Signed-off-by: Mike Kravetz Acked-by: Peter Xu -- Peter Xu

Re: [RFC PATCH] mm: remove zap_page_range and change callers to use zap_vma_page_range

2022-12-20 Thread Peter Xu
> > [1] > https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.krav...@oracle.com/ > Suggested-by: Peter Xu > Signed-off-by: Mike Kravetz Acked-by: Peter Xu Thanks! -- Peter Xu

Re: [PATCH v4] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-30 Thread Peter Xu
migration_entry_wait_huge(pte, ptl); > + goto retry; > + } > + /* > + * hwpoisoned entry is treated as no_page_table in > + * follow_page_mask(). > + */ > + } > +out: > + spin_unlock(ptl); > + return page; > +} -- Peter Xu

Re: [PATCH v4] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-28 Thread Peter Xu
f-work on Mon & Tue, but maybe I'll still try). -- Peter Xu

Re: [PATCH v3] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-28 Thread Peter Xu
On Fri, Oct 28, 2022 at 08:27:57AM -0700, Mike Kravetz wrote: > On 10/27/22 15:34, Peter Xu wrote: > > On Wed, Oct 26, 2022 at 05:34:04PM -0700, Mike Kravetz wrote: > > > On 10/26/22 17:59, Peter Xu wrote: > > > > If we want to use the vma read lock to pro

Re: [PATCH v3] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-27 Thread Peter Xu
On Wed, Oct 26, 2022 at 05:34:04PM -0700, Mike Kravetz wrote: > On 10/26/22 17:59, Peter Xu wrote: > > Hi, Mike, > > > > On Sun, Sep 18, 2022 at 07:13:48PM -0700, Mike Kravetz wrote: > > > +struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, > >

Re: [PATCH v3] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-26 Thread Peter Xu
because the worst case is the caller will fetch a wrong page, but then it should be invalidated very soon with mmu notifiers. One thing worth mentioning is that pmd unshare should never free a pgtable page. IIUC it's also the same as fast-gup - afaiu we don't take the read vma lock in fast-gup either but I also think it's safe. But I hope I didn't miss something. -- Peter Xu

Re: [v2 PATCH 2/2] powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush

2022-09-07 Thread Peter Xu
mar K.V > Signed-off-by: Yang Shi Acked-by: Peter Xu -- Peter Xu

Re: [RFC PATCH RESEND 19/28] mm: disallow do_swap_page to handle page faults under VMA lock

2022-09-06 Thread Peter Xu
On Tue, Sep 06, 2022 at 01:08:10PM -0700, Suren Baghdasaryan wrote: > On Tue, Sep 6, 2022 at 12:39 PM Peter Xu wrote: > > > > On Thu, Sep 01, 2022 at 10:35:07AM -0700, Suren Baghdasaryan wrote: > > > Due to the possibility of do_swap_page dropping mmap_lock, abort fault

Re: [RFC PATCH RESEND 19/28] mm: disallow do_swap_page to handle page faults under VMA lock

2022-09-06 Thread Peter Xu
) > vm_fault_t ret = 0; > void *shadow = NULL; > > + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { > + ret = VM_FAULT_RETRY; > + goto out; > + } > + May want to fail early in handle_userfault() too, for a similar reason. Thanks, -- Peter Xu

Re: [PATCH v4 2/4] mm/migrate_device.c: Add missing flush_cache_page()

2022-09-02 Thread Peter Xu
diff after rebase, though.. I'm not sure how the ordering would be at last, but anyway I think this patch stands on its own too.. Acked-by: Peter Xu Thanks for tolerating my nitpicking, > > --- > > New for v4 > --- > mm/migrate_device.c | 2 +- > 1 file changed

Re: [PATCH v4 1/4] mm/migrate_device.c: Flush TLB while holding PTL

2022-09-02 Thread Peter Xu
try > after madvise returns. Fix this by flushing the TLB while holding the > PTL. > > Signed-off-by: Alistair Popple > Reported-by: Nadav Amit > Reviewed-by: "Huang, Ying" > Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while > collecting pages") > Cc: sta...@vger.kernel.org Acked-by: Peter Xu -- Peter Xu

Re: [PATCH v3 2/3] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-26 Thread Peter Xu
On Fri, Aug 26, 2022 at 06:46:02PM +0200, David Hildenbrand wrote: > On 26.08.22 17:55, Peter Xu wrote: > > On Fri, Aug 26, 2022 at 04:47:22PM +0200, David Hildenbrand wrote: > >>> To me anon exclusive only shows this mm exclusively owns this page. I > >>>

Re: [PATCH v3 2/3] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-26 Thread Peter Xu
s the magic bit, we have to make sure that we won't see new > GUP pins, thus the TLB flush. > > include/linux/mm.h:gup_must_unshare() contains documentation. Hmm.. Shouldn't ptep_get_and_clear() (e.g., xchg() on x86_64) already guarantee that no other process/thread will see this pte anymore afterwards? -- Peter Xu

Re: [PATCH v3 2/3] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-26 Thread Peter Xu
On Fri, Aug 26, 2022 at 11:02:58AM +1000, Alistair Popple wrote: > > Peter Xu writes: > > > On Fri, Aug 26, 2022 at 08:21:44AM +1000, Alistair Popple wrote: > >> > >> Peter Xu writes: > >> > >> > On Wed, Aug 24, 2022 at 01:03:38PM +1000,

Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-25 Thread Peter Xu
(or have > swap-cache allocated to it, but I'm hoping to at least get that fixed). If so I'd suggest even more straightforward documentation for either this trylock() or on the APIs (e.g. for migrate_vma_setup()). This behavior is IMHO deeply hidden and many people may not realize it. I'll comment in the comment update patch. Thanks. -- Peter Xu

Re: [PATCH v3 2/3] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-25 Thread Peter Xu
On Fri, Aug 26, 2022 at 08:21:44AM +1000, Alistair Popple wrote: > > Peter Xu writes: > > > On Wed, Aug 24, 2022 at 01:03:38PM +1000, Alistair Popple wrote: > >> migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that > >> installs migratio

Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-25 Thread Peter Xu
be changed if done explicitly (e.g. fork() plus mremap() for anonymous here) but I just want to make sure I get the whole point of it. Thanks, -- Peter Xu
