Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Mike Rapoport
On Mon, Apr 19, 2021 at 01:21:56PM +0100, Matthew Wilcox wrote:
> On Mon, Apr 19, 2021 at 02:56:17PM +0300, Mike Rapoport wrote:
> 
> > > With that fixed, you'll have a head page that you can use for testing,
> > > which means you don't need to test PageCompound() (because you know the
> > > page isn't PageTail), you can just test PageHead().
> > 
> > I can't say I follow you here. page_is_secretmem() is intended to be a
> > generic test, so I don't see how it would get a head page if it is
> > called from other places.
> 
> static inline bool head_is_secretmem(struct page *head)
> {
> 	if (PageHead(head))
> 		return false;
> 	...
> }
> 
> static inline bool page_is_secretmem(struct page *page)
> {
> 	if (PageTail(page))
> 		return false;
> 	return head_is_secretmem(page);
> }
> 
> (yes, calling it head is a misnomer, because it's not necessarily a head,
> it might be a base page, but until we have a name for pages which might
> be a head page or a base page, it'll have to do ...)

To me this looks less clean and readable, and in the end we have an extra
check for PG_head if it doesn't get optimized away.
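
[Editor's note: for reference, the v5.12-era definition this hinges on,
abridged from include/linux/page-flags.h. PageCompound() is itself the
pair of head/tail tests, which suggests the split helpers may well
compile down to the same two bit tests once inlined:]

static __always_inline int PageCompound(struct page *page)
{
	return test_bit(PG_head, &page->flags) || PageTail(page);
}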

It does not seem to me that there would be a measurable difference, but if
this is important for the future folio conversion I don't mind using
{head,page}_is_secretmem().

-- 
Sincerely yours,
Mike.


Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Matthew Wilcox
On Mon, Apr 19, 2021 at 02:56:17PM +0300, Mike Rapoport wrote:
> On Mon, Apr 19, 2021 at 12:23:02PM +0100, Matthew Wilcox wrote:
> > So you're calling page_is_secretmem() on a struct page without having
> > a refcount on it.  That is definitely not allowed.  secretmem seems to
> > be full of these kinds of races; I know this isn't the first one I've
> > seen in it.  I don't think this patchset is ready for this merge window.
> 
> There were races in the older version that did caching of large pages,
> and those were fixed at the time, but this is irrelevant anyway because
> all that code was dropped in the latest respins.
> 
> I don't think the race fix in gup_pte_range is significant enough to
> make the series wait three more months.

I have no particular interest in secretmem, but it seems that every time
I come across it while looking at something else, I see these kinds of
major mistakes in it.  That says to me it's not ready and hasn't seen
enough review.

> > With that fixed, you'll have a head page that you can use for testing,
> > which means you don't need to test PageCompound() (because you know the
> > page isn't PageTail), you can just test PageHead().
> 
> I can't say I follow you here. page_is_secretmem() is intended to be a
> generic test, so I don't see how it would get a head page if it is
> called from other places.

static inline bool head_is_secretmem(struct page *head)
{
	if (PageHead(head))
		return false;
	...
}

static inline bool page_is_secretmem(struct page *page)
{
	if (PageTail(page))
		return false;
	return head_is_secretmem(page);
}

(yes, calling it head is a misnomer, because it's not necessarily a head,
it might be a base page, but until we have a name for pages which might
be a head page or a base page, it'll have to do ...)


Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Mike Rapoport
On Mon, Apr 19, 2021 at 12:23:02PM +0100, Matthew Wilcox wrote:
> On Mon, Apr 19, 2021 at 11:42:18AM +0300, Mike Rapoport wrote:
> > The perf profile of the test indicated that the regression is caused by
> > page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):
> 
> Uhh ... you're calling it in the wrong place!
> 
> 	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
> 	page = pte_page(pte);
> 
> 	if (page_is_secretmem(page))
> 		goto pte_unmap;
> 
> 	head = try_grab_compound_head(page, 1, flags);
> 	if (!head)
> 		goto pte_unmap;
> 
> So you're calling page_is_secretmem() on a struct page without having
> a refcount on it.  That is definitely not allowed.  secretmem seems to
> be full of these kinds of races; I know this isn't the first one I've
> seen in it.  I don't think this patchset is ready for this merge window.

There were races in the older version that did caching of large pages,
and those were fixed at the time, but this is irrelevant anyway because
all that code was dropped in the latest respins.

I don't think the race fix in gup_pte_range is significant enough to
make the series wait three more months.
 
> With that fixed, you'll have a head page that you can use for testing,
> which means you don't need to test PageCompound() (because you know the
> page isn't PageTail), you can just test PageHead().

I can't say I follow you here. page_is_secretmem() is intended to be a
generic test, so I don't see how it would get a head page if it is
called from other places.

-- 
Sincerely yours,
Mike.


Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Matthew Wilcox
On Mon, Apr 19, 2021 at 12:36:19PM +0300, Mike Rapoport wrote:
> Well, most of the -4.2% performance regression kbuild reported was due
> to repeated compound_head(page) in page_mapping(). So the whole point
> of this patch is to avoid calling page_mapping().

It's quite ludicrous how many times we call compound_head() in
page_mapping() today:

	page = compound_head(page);
	if (__builtin_expect(!!(PageSlab(page)), 0))
	if (__builtin_expect(!!(PageSwapCache(page)), 0)) {

TESTPAGEFLAG(Slab, slab, PF_NO_TAIL) expands to:

static __always_inline int PageSlab(struct page *page)
{
	PF_POISONED_CHECK(compound_head(page));
	return test_bit(PG_slab, &compound_head(page)->flags);
}

static __always_inline int PageSwapCache(struct page *page)
{
	page = compound_head(page);
	return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
}

but then!

TESTPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL) also expands like Slab does.

So that's six calls to compound_head(), depending on what Kconfig options
you have enabled.

And folio_mapping() is one of the functions I add in the first batch of
patches, so review, etc., will be helpful.
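
[Editor's illustration, not from the original mail: the folio approach
amounts to doing the page-to-head conversion exactly once. A minimal
sketch under a hypothetical name -- this is not the folio series'
actual folio_mapping():]

static inline struct address_space *page_mapping_once(struct page *page)
{
	struct page *head = compound_head(page);	/* single conversion */

	/* every later test is against the head page directly */
	if (unlikely(test_bit(PG_slab, &head->flags)))
		return NULL;

	/* swap-cache and anon handling elided; the real page_mapping()
	 * returns the swap address space resp. NULL for those cases */
	return (struct address_space *)
		((unsigned long)head->mapping & ~PAGE_MAPPING_FLAGS);
}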


Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Matthew Wilcox
On Mon, Apr 19, 2021 at 11:42:18AM +0300, Mike Rapoport wrote:
> The perf profile of the test indicated that the regression is caused by
> page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):

Uhh ... you're calling it in the wrong place!

	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
	page = pte_page(pte);

	if (page_is_secretmem(page))
		goto pte_unmap;

	head = try_grab_compound_head(page, 1, flags);
	if (!head)
		goto pte_unmap;

So you're calling page_is_secretmem() on a struct page without having
a refcount on it.  That is definitely not allowed.  secretmem seems to
be full of these kinds of races; I know this isn't the first one I've
seen in it.  I don't think this patchset is ready for this merge window.

With that fixed, you'll have a head page that you can use for testing,
which means you don't need to test PageCompound() (because you know the
page isn't PageTail), you can just test PageHead().
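
[Editor's sketch of the suggested ordering; hedged, not the committed
fix verbatim. It reuses the names from the snippet above and assumes
put_compound_head(), gup's unpin helper of that era, to drop the
just-taken reference on the secretmem path:]

	head = try_grab_compound_head(page, 1, flags);
	if (!head)
		goto pte_unmap;

	/* a reference is held now, so the page is stable to test */
	if (unlikely(page_is_secretmem(page))) {
		put_compound_head(head, 1, flags);
		goto pte_unmap;
	}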


Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread David Hildenbrand

On 19.04.21 12:14, Mike Rapoport wrote:

On Mon, Apr 19, 2021 at 11:40:56AM +0200, David Hildenbrand wrote:

On 19.04.21 11:38, David Hildenbrand wrote:

On 19.04.21 11:36, Mike Rapoport wrote:

On Mon, Apr 19, 2021 at 11:15:02AM +0200, David Hildenbrand wrote:

On 19.04.21 10:42, Mike Rapoport wrote:

From: Mike Rapoport 

Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
due to commit "mm: introduce memfd_secret system call to create "secret"
memory areas".

The perf profile of the test indicated that the regression is caused by
page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):

 27.76  +2.5  30.23   perf-profile.children.cycles-pp.gup_pgd_range
  0.00  +3.2   3.19 ± 2%  perf-profile.children.cycles-pp.page_mapping
  0.00  +3.7   3.66 ± 2%  perf-profile.children.cycles-pp.page_is_secretmem

Further analysis showed that the slowdown happens because neither
page_is_secretmem() nor page_mapping() is inline and, moreover,
multiple page flag checks in page_mapping() involve calling
compound_head() several times for the same page.

Make page_is_secretmem() inline and replace page_mapping() with page flag
checks that do not imply page-to-head conversion.

Reported-by: kernel test robot 
Signed-off-by: Mike Rapoport 
---

@Andrew,
The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if it would
be added as a fixup to the memfd_secret series.

 include/linux/secretmem.h | 26 +-
 mm/secretmem.c| 12 +---
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 907a6734059c..b842b38cbeb1 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,8 +4,32 @@
 #ifdef CONFIG_SECRETMEM
+extern const struct address_space_operations secretmem_aops;
+
+static inline bool page_is_secretmem(struct page *page)
+{
+   struct address_space *mapping;
+
+   /*
+    * Using page_mapping() is quite slow because of the actual call
+    * instruction and repeated compound_head(page) inside the
+    * page_mapping() function.
+    * We know that secretmem pages are not compound and LRU so we can
+    * save a couple of cycles here.
+    */
+   if (PageCompound(page) || !PageLRU(page))
+      return false;


I'd assume secretmem pages are rare in basically every setup out there. So
maybe throwing in a couple of likely()/unlikely() might make sense.


I'd say we could do unlikely(page_is_secretmem()) at call sites. Here I can
hardly estimate which pages are going to be checked.

+
+   mapping = (struct address_space *)
+      ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
+


Not sure if open-coding page_mapping is really a good idea here -- or even
necessary after the fast path above is in place. Anyhow, just my 2 cents.


Well, most of the -4.2% performance regression kbuild reported was due
to repeated compound_head(page) in page_mapping(). So the whole point
of this patch is to avoid calling page_mapping().


I would have thought the fast path "(PageCompound(page) ||
!PageLRU(page))" would already avoid calling page_mapping() in many cases.


(and I do wonder if a generic page_mapping() optimization would make sense
instead)


Not sure. Replacing page_mapping() with page_file_mapping() at the
call sites in fs/ and mm/ increased the defconfig image by nearly 2k,
and page_file_mapping() is way simpler than page_mapping():

add/remove: 1/0 grow/shrink: 35/0 up/down: 1960/0 (1960)
Function                                old     new   delta
shrink_page_list                       3414    3670    +256
__set_page_dirty_nobuffers              242     349    +107
check_move_unevictable_pages            904     987     +83
move_to_new_page                        591     671     +80
shrink_active_list                      912     970     +58
move_pages_to_lru                       911     965     +54
migrate_pages                          2500    2554     +54
shmem_swapin_page                      1145    1197     +52
shmem_undo_range                       1669    1719     +50
__test_set_page_writeback               620     670     +50
__set_page_dirty_buffers                187     237     +50
__pagevec_lru_add                       757     807     +50
__munlock_pagevec                      1155    1205     +50
__dump_page                            1101    1151     +50
__cancel_dirty_page                     182     232     +50
__remove_mapping                        461     510     +49
rmap_walk_file                          402     449     +47
isolate_movable_page                    240     287     +47
test_clear_page_writeback               668     714     +46
page_cache_pipe_buf_try_steal           171     217     +46
page_endio
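
[Editor's note for context on "way simpler": this is roughly the
v5.12-era shape of page_file_mapping() as I recall it, abridged;
__page_file_mapping() resolves the swap-cache mapping:]

static inline struct address_space *page_file_mapping(struct page *page)
{
	/* only the swap cache needs special handling here */
	if (unlikely(PageSwapCache(page)))
		return __page_file_mapping(page);

	return page->mapping;
}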

Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Mike Rapoport
On Mon, Apr 19, 2021 at 11:40:56AM +0200, David Hildenbrand wrote:
> On 19.04.21 11:38, David Hildenbrand wrote:
> > On 19.04.21 11:36, Mike Rapoport wrote:
> > > On Mon, Apr 19, 2021 at 11:15:02AM +0200, David Hildenbrand wrote:
> > > > On 19.04.21 10:42, Mike Rapoport wrote:
> > > > > From: Mike Rapoport 
> > > > > 
> > > > > Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
> > > > > due to commit "mm: introduce memfd_secret system call to create "secret"
> > > > > memory areas".
> > > > > 
> > > > > The perf profile of the test indicated that the regression is caused by
> > > > > page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):
> > > > > 
> > > > > 27.76  +2.5  30.23   perf-profile.children.cycles-pp.gup_pgd_range
> > > > >  0.00  +3.2   3.19 ± 2%  perf-profile.children.cycles-pp.page_mapping
> > > > >  0.00  +3.7   3.66 ± 2%  perf-profile.children.cycles-pp.page_is_secretmem
> > > > > 
> > > > > Further analysis showed that the slowdown happens because neither
> > > > > page_is_secretmem() nor page_mapping() is inline and, moreover,
> > > > > multiple page flag checks in page_mapping() involve calling
> > > > > compound_head() several times for the same page.
> > > > > 
> > > > > Make page_is_secretmem() inline and replace page_mapping() with page
> > > > > flag checks that do not imply page-to-head conversion.
> > > > > 
> > > > > Reported-by: kernel test robot 
> > > > > Signed-off-by: Mike Rapoport 
> > > > > ---
> > > > > 
> > > > > @Andrew,
> > > > > The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if
> > > > > it would be added as a fixup to the memfd_secret series.
> > > > > 
> > > > > include/linux/secretmem.h | 26 +-
> > > > > mm/secretmem.c| 12 +---
> > > > > 2 files changed, 26 insertions(+), 12 deletions(-)
> > > > > 
> > > > > diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
> > > > > index 907a6734059c..b842b38cbeb1 100644
> > > > > --- a/include/linux/secretmem.h
> > > > > +++ b/include/linux/secretmem.h
> > > > > @@ -4,8 +4,32 @@
> > > > > #ifdef CONFIG_SECRETMEM
> > > > > +extern const struct address_space_operations secretmem_aops;
> > > > > +
> > > > > +static inline bool page_is_secretmem(struct page *page)
> > > > > +{
> > > > > +   struct address_space *mapping;
> > > > > +
> > > > > +   /*
> > > > > +    * Using page_mapping() is quite slow because of the actual call
> > > > > +    * instruction and repeated compound_head(page) inside the
> > > > > +    * page_mapping() function.
> > > > > +    * We know that secretmem pages are not compound and LRU so we can
> > > > > +    * save a couple of cycles here.
> > > > > +    */
> > > > > +   if (PageCompound(page) || !PageLRU(page))
> > > > > +      return false;
> > > > 
> > > > I'd assume secretmem pages are rare in basically every setup out
> > > > there. So maybe throwing in a couple of likely()/unlikely() might
> > > > make sense.
> > > 
> > > I'd say we could do unlikely(page_is_secretmem()) at call sites. Here
> > > I can hardly estimate which pages are going to be checked.
> > > > > +
> > > > > +   mapping = (struct address_space *)
> > > > > +      ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
> > > > > +
> > > > 
> > > > Not sure if open-coding page_mapping is really a good idea here --
> > > > or even necessary after the fast path above is in place. Anyhow,
> > > > just my 2 cents.
> > > 
> > > Well, most of the -4.2% performance regression kbuild reported was
> > > due to repeated compound_head(page) in page_mapping(). So the whole
> > > point of this patch is to avoid calling page_mapping().
> > 
> > I would have thought the fast path "(PageCompound(page) ||
> > !PageLRU(page))" would already avoid calling page_mapping() in many cases.
> 
> (and I do wonder if a generic page_mapping() optimization would make sense
> instead)

Not sure. Replacing page_mapping() with page_file_mapping() at the
call sites in fs/ and mm/ increased the defconfig image by nearly 2k,
and page_file_mapping() is way simpler than page_mapping():

add/remove: 1/0 grow/shrink: 35/0 up/down: 1960/0 (1960)
Function                                old     new   delta
shrink_page_list                       3414    3670    +256
__set_page_dirty_nobuffers              242     349    +107
check_move_unevictable_pages            904     987     +83
move_to_new_page                        591     671     +80
shrink_active_list                      912     970     +58
move_pages_to_lru                       911     965     +54
migrate_pages                          2500    2554     +54
shmem_swapin_page                      1145    1197     +52
shmem_undo_range

Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread David Hildenbrand

On 19.04.21 11:38, David Hildenbrand wrote:

On 19.04.21 11:36, Mike Rapoport wrote:

On Mon, Apr 19, 2021 at 11:15:02AM +0200, David Hildenbrand wrote:

On 19.04.21 10:42, Mike Rapoport wrote:

From: Mike Rapoport 

Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
due to commit "mm: introduce memfd_secret system call to create "secret"
memory areas".

The perf profile of the test indicated that the regression is caused by
page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):

27.76  +2.5  30.23   perf-profile.children.cycles-pp.gup_pgd_range
 0.00  +3.2   3.19 ± 2%  perf-profile.children.cycles-pp.page_mapping
 0.00  +3.7   3.66 ± 2%  perf-profile.children.cycles-pp.page_is_secretmem

Further analysis showed that the slowdown happens because neither
page_is_secretmem() nor page_mapping() is inline and, moreover,
multiple page flag checks in page_mapping() involve calling
compound_head() several times for the same page.

Make page_is_secretmem() inline and replace page_mapping() with page flag
checks that do not imply page-to-head conversion.

Reported-by: kernel test robot 
Signed-off-by: Mike Rapoport 
---

@Andrew,
The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if it would
be added as a fixup to the memfd_secret series.

include/linux/secretmem.h | 26 +-
mm/secretmem.c| 12 +---
2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 907a6734059c..b842b38cbeb1 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,8 +4,32 @@
#ifdef CONFIG_SECRETMEM
+extern const struct address_space_operations secretmem_aops;
+
+static inline bool page_is_secretmem(struct page *page)
+{
+   struct address_space *mapping;
+
+   /*
+    * Using page_mapping() is quite slow because of the actual call
+    * instruction and repeated compound_head(page) inside the
+    * page_mapping() function.
+    * We know that secretmem pages are not compound and LRU so we can
+    * save a couple of cycles here.
+    */
+   if (PageCompound(page) || !PageLRU(page))
+      return false;


I'd assume secretmem pages are rare in basically every setup out there. So
maybe throwing in a couple of likely()/unlikely() might make sense.


I'd say we could do unlikely(page_is_secretmem()) at call sites. Here I can
hardly estimate which pages are going to be checked.
   

+
+   mapping = (struct address_space *)
+      ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
+


Not sure if open-coding page_mapping is really a good idea here -- or even
necessary after the fast path above is in place. Anyhow, just my 2 cents.


Well, most of the -4.2% performance regression kbuild reported was due
to repeated compound_head(page) in page_mapping(). So the whole point
of this patch is to avoid calling page_mapping().


I would have thought the fast path "(PageCompound(page) ||
!PageLRU(page))" would already avoid calling page_mapping() in many cases.


(and I do wonder if a generic page_mapping() optimization would make 
sense instead)


Willy can most probably give the best advice here :)

--
Thanks,

David / dhildenb



Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread David Hildenbrand

On 19.04.21 11:36, Mike Rapoport wrote:

On Mon, Apr 19, 2021 at 11:15:02AM +0200, David Hildenbrand wrote:

On 19.04.21 10:42, Mike Rapoport wrote:

From: Mike Rapoport 

Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
due to commit "mm: introduce memfd_secret system call to create "secret"
memory areas".

The perf profile of the test indicated that the regression is caused by
page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):

   27.76  +2.5  30.23   perf-profile.children.cycles-pp.gup_pgd_range
0.00  +3.2   3.19 ± 2%  perf-profile.children.cycles-pp.page_mapping
0.00  +3.7   3.66 ± 2%  perf-profile.children.cycles-pp.page_is_secretmem

Further analysis showed that the slowdown happens because neither
page_is_secretmem() nor page_mapping() is inline and, moreover,
multiple page flag checks in page_mapping() involve calling
compound_head() several times for the same page.

Make page_is_secretmem() inline and replace page_mapping() with page flag
checks that do not imply page-to-head conversion.

Reported-by: kernel test robot 
Signed-off-by: Mike Rapoport 
---

@Andrew,
The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if it would
be added as a fixup to the memfd_secret series.

   include/linux/secretmem.h | 26 +-
   mm/secretmem.c| 12 +---
   2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 907a6734059c..b842b38cbeb1 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,8 +4,32 @@
   #ifdef CONFIG_SECRETMEM
+extern const struct address_space_operations secretmem_aops;
+
+static inline bool page_is_secretmem(struct page *page)
+{
+   struct address_space *mapping;
+
+   /*
+    * Using page_mapping() is quite slow because of the actual call
+    * instruction and repeated compound_head(page) inside the
+    * page_mapping() function.
+    * We know that secretmem pages are not compound and LRU so we can
+    * save a couple of cycles here.
+    */
+   if (PageCompound(page) || !PageLRU(page))
+      return false;


I'd assume secretmem pages are rare in basically every setup out there. So
maybe throwing in a couple of likely()/unlikely() might make sense.


I'd say we could do unlikely(page_is_secretmem()) at call sites. Here I can
hardly estimate which pages are going to be checked.
  

+
+   mapping = (struct address_space *)
+      ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
+


Not sure if open-coding page_mapping is really a good idea here -- or even
necessary after the fast path above is in place. Anyhow, just my 2 cents.


Well, most of the -4.2% performance regression kbuild reported was due
to repeated compound_head(page) in page_mapping(). So the whole point
of this patch is to avoid calling page_mapping().


I would have thought the fast path "(PageCompound(page) || 
!PageLRU(page))" would already avoid calling page_mapping() in many cases.



--
Thanks,

David / dhildenb



Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Mike Rapoport
On Mon, Apr 19, 2021 at 11:15:02AM +0200, David Hildenbrand wrote:
> On 19.04.21 10:42, Mike Rapoport wrote:
> > From: Mike Rapoport 
> > 
> > Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
> > due to commit "mm: introduce memfd_secret system call to create "secret"
> > memory areas".
> > 
> > The perf profile of the test indicated that the regression is caused by
> > page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):
> > 
> >   27.76  +2.5  30.23   perf-profile.children.cycles-pp.gup_pgd_range
> >0.00  +3.2   3.19 ± 2%  perf-profile.children.cycles-pp.page_mapping
> >0.00  +3.7   3.66 ± 2%  perf-profile.children.cycles-pp.page_is_secretmem
> > 
> > Further analysis showed that the slowdown happens because neither
> > page_is_secretmem() nor page_mapping() is inline and, moreover,
> > multiple page flag checks in page_mapping() involve calling
> > compound_head() several times for the same page.
> > 
> > Make page_is_secretmem() inline and replace page_mapping() with page flag
> > checks that do not imply page-to-head conversion.
> > 
> > Reported-by: kernel test robot 
> > Signed-off-by: Mike Rapoport 
> > ---
> > 
> > @Andrew,
> > The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if it would
> > be added as a fixup to the memfd_secret series.
> > 
> >   include/linux/secretmem.h | 26 +-
> >   mm/secretmem.c| 12 +---
> >   2 files changed, 26 insertions(+), 12 deletions(-)
> > 
> > diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
> > index 907a6734059c..b842b38cbeb1 100644
> > --- a/include/linux/secretmem.h
> > +++ b/include/linux/secretmem.h
> > @@ -4,8 +4,32 @@
> >   #ifdef CONFIG_SECRETMEM
> > +extern const struct address_space_operations secretmem_aops;
> > +
> > +static inline bool page_is_secretmem(struct page *page)
> > +{
> > +   struct address_space *mapping;
> > +
> > +   /*
> > +    * Using page_mapping() is quite slow because of the actual call
> > +    * instruction and repeated compound_head(page) inside the
> > +    * page_mapping() function.
> > +    * We know that secretmem pages are not compound and LRU so we can
> > +    * save a couple of cycles here.
> > +    */
> > +   if (PageCompound(page) || !PageLRU(page))
> > +      return false;
> 
> I'd assume secretmem pages are rare in basically every setup out there. So
> maybe throwing in a couple of likely()/unlikely() might make sense.

I'd say we could do unlikely(page_is_secretmem()) at call sites. Here I can
hardly estimate which pages are going to be checked.
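
[Editor's sketch of that call-site annotation in gup_pte_range();
a hedged illustration, not from the original mail:]

	if (unlikely(page_is_secretmem(page)))
		goto pte_unmap;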
 
> > +
> > +   mapping = (struct address_space *)
> > +      ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
> > +
> 
> Not sure if open-coding page_mapping is really a good idea here -- or even
> necessary after the fast path above is in place. Anyhow, just my 2 cents.

Well, most of the -4.2% performance regression kbuild reported was due
to repeated compound_head(page) in page_mapping(). So the whole point
of this patch is to avoid calling page_mapping().

> The idea of the patch makes sense to me.

-- 
Sincerely yours,
Mike.


Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread David Hildenbrand

On 19.04.21 10:42, Mike Rapoport wrote:

From: Mike Rapoport 

Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
due to commit "mm: introduce memfd_secret system call to create "secret"
memory areas".

The perf profile of the test indicated that the regression is caused by
page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):

  27.76  +2.5  30.23   perf-profile.children.cycles-pp.gup_pgd_range
   0.00  +3.2   3.19 ± 2%  perf-profile.children.cycles-pp.page_mapping
   0.00  +3.7   3.66 ± 2%  perf-profile.children.cycles-pp.page_is_secretmem

Further analysis showed that the slowdown happens because neither
page_is_secretmem() nor page_mapping() is inline and, moreover,
multiple page flag checks in page_mapping() involve calling
compound_head() several times for the same page.

Make page_is_secretmem() inline and replace page_mapping() with page flag
checks that do not imply page-to-head conversion.

Reported-by: kernel test robot 
Signed-off-by: Mike Rapoport 
---

@Andrew,
The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if it would
be added as a fixup to the memfd_secret series.

  include/linux/secretmem.h | 26 +-
  mm/secretmem.c| 12 +---
  2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 907a6734059c..b842b38cbeb1 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,8 +4,32 @@
  
  #ifdef CONFIG_SECRETMEM
  
+extern const struct address_space_operations secretmem_aops;

+
+static inline bool page_is_secretmem(struct page *page)
+{
+   struct address_space *mapping;
+
+   /*
+    * Using page_mapping() is quite slow because of the actual call
+    * instruction and repeated compound_head(page) inside the
+    * page_mapping() function.
+    * We know that secretmem pages are not compound and LRU so we can
+    * save a couple of cycles here.
+    */
+   if (PageCompound(page) || !PageLRU(page))
+      return false;


I'd assume secretmem pages are rare in basically every setup out there. 
So maybe throwing in a couple of likely()/unlikely() might make sense.



+
+   mapping = (struct address_space *)
+      ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
+


Not sure if open-coding page_mapping is really a good idea here -- or 
even necessary after the fast path above is in place. Anyhow, just my 2 
cents.


The idea of the patch makes sense to me.

--
Thanks,

David / dhildenb