On Tue, Apr 10, 2018 at 05:05:28AM -0700, Matthew Wilcox wrote:
> On Tue, Apr 10, 2018 at 10:26:43AM +0200, Michal Hocko wrote:
> > On Mon 09-04-18 12:40:44, Matthew Wilcox wrote:
> > > The problem is that the mapping gfp flags are used not only for allocating
> > > pages, but also for allocating the page cache data structures that hold
> > > the pages.  F2FS is the only filesystem that set the __GFP_ZERO bit,
> > > so it's the first time anyone's noticed that the page cache passes the
> > > __GFP_ZERO bit through to the radix tree allocation routines, which
> > > causes the radix tree nodes to be zeroed instead of constructed.
> > > 
> > > I think the right solution to this is:
> > 
> > This just hides the underlying problem that the node is not fully and
> > properly initialized. Relying on the previous released state is just too
> > subtle.
> That's the fundamental design of slab-with-constructors.  The user provides
> a constructor, so all newly allocated objects are initialised to a known
> state, then the user will restore the object to that state when it frees
> the object to slab.
> > Are you going to blacklist all potential gfp flags that come
> > from the mapping? This is just unmaintainable! If anything this should
> > be an explicit & with the set of allowed flags.
> Oh, I agree that using the set of flags used to allocate the page
> in order to allocate the radix tree nodes is a pretty horrible idea.
> Your suggestion, then, is:
> -     error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> +     error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
> correct?

Looks much better.
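
To spell out why the allow-list form is safer, here is a minimal userspace
sketch. The DEMO_* bit values and the reduced reclaim mask are assumptions
made up for illustration; the real flags and GFP_RECLAIM_MASK are defined in
the kernel headers and contain more bits. The only point is that
"& ~__GFP_HIGHMEM" lets __GFP_ZERO through, while "& GFP_RECLAIM_MASK"
cannot.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t demo_gfp_t;

/*
 * Hypothetical bit values, for illustration only. The real definitions
 * are in the kernel headers and differ between versions.
 */
#define DEMO_GFP_HIGHMEM        (1u << 0)
#define DEMO_GFP_ZERO           (1u << 1)
#define DEMO_GFP_IO             (1u << 2)
#define DEMO_GFP_FS             (1u << 3)
#define DEMO_GFP_DIRECT_RECLAIM (1u << 4)
#define DEMO_GFP_KSWAPD_RECLAIM (1u << 5)

/* Simplified stand-in for GFP_RECLAIM_MASK: reclaim-behaviour bits only. */
#define DEMO_GFP_RECLAIM_MASK \
	(DEMO_GFP_IO | DEMO_GFP_FS | \
	 DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)

/* Old behaviour: a deny-list that strips only the highmem bit. */
static demo_gfp_t demo_old_mask(demo_gfp_t gfp)
{
	return gfp & ~DEMO_GFP_HIGHMEM;
}

/* New behaviour: an allow-list that keeps only reclaim-related bits. */
static demo_gfp_t demo_new_mask(demo_gfp_t gfp)
{
	return gfp & DEMO_GFP_RECLAIM_MASK;
}

/* Checks the difference for a mapping mask that carries the zero bit. */
static void demo_check_masks(void)
{
	demo_gfp_t gfp = DEMO_GFP_IO | DEMO_GFP_FS |
			 DEMO_GFP_ZERO | DEMO_GFP_HIGHMEM;

	assert(demo_old_mask(gfp) & DEMO_GFP_ZERO);    /* zero bit leaks */
	assert(!(demo_new_mask(gfp) & DEMO_GFP_ZERO)); /* zero bit dropped */
}
```

Calling demo_check_masks() from a test program should pass silently. Any
flag added to a mapping mask in the future is dropped by default with the
allow-list, which is what Michal asked for.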

Finally, it seems everyone agrees on this. However, I won't include the
warning part for the slab allocator, because I think it is an improvement
rather than a bug fix, so it can be separated.
If anyone really wants to include it in this stable patch,
please discuss it with the slub maintainers first.

Thanks for the review, Matthew, Michal, Jan and Johannes.
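
For anyone reading the changelog below without the list code in mind, here
is a small userspace sketch of why a zeroed node trips the list_empty()
check. The demo_* names are made up for illustration, and the list helpers
are simplified re-implementations, not the kernel's own code. It assumes
that all-zero bytes read back as NULL pointers, which holds on mainstream
platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified circular doubly linked list head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* An empty list points at itself. */
static void demo_init_list_head(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Empty means "next points back at the head", not "next is NULL". */
static int demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

/* Hypothetical stand-in for a radix tree node. */
struct demo_node {
	struct demo_list_head private_list;
};

/* What the slab constructor is expected to have done. */
static void demo_node_ctor(struct demo_node *n)
{
	demo_init_list_head(&n->private_list);
}

/* What a zeroing allocation does to the node instead. */
static void demo_node_zero(struct demo_node *n)
{
	memset(n, 0, sizeof(*n));
}

/* Checks both states: constructed is empty, zeroed is "not empty". */
static void demo_check_node(void)
{
	struct demo_node n;

	demo_node_ctor(&n);
	assert(demo_list_empty(&n.private_list));

	demo_node_zero(&n);
	assert(!demo_list_empty(&n.private_list));
	assert(n.private_list.next == NULL);
}
```

A zeroed node therefore looks like it is on a list, and any list removal
that follows would go through the NULL next/prev pointers.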

From 652bb75124896fa040df78b98496a354f54fc524 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minc...@kernel.org>
Date: Tue, 10 Apr 2018 22:13:50 +0900
Subject: [PATCH v4] mm: workingset: fix NULL ptr dereference

GFP mask passed to page cache functions (often coming from
mapping->gfp_mask) is used both for allocation of page cache page and for
allocation of radix tree metadata necessary to add the page to the page
cache. When the mask contains __GFP_ZERO (as is the case for some f2fs
metadata mappings), this breaks radix tree code as that code expects
allocated radix tree nodes to be properly initialized by the slab
constructor and not zeroed. In particular node->private_list is failing
list_empty() check and the following list operation in
workingset_update_node() will dereference NULL.

Fix the problem by masking the gfp flags with GFP_RECLAIM_MASK for radix
tree allocations, so that flags unrelated to reclaim are dropped.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Jan Kara <j...@suse.cz>
Cc: Jaegeuk Kim <jaeg...@kernel.org>
Cc: Chao Yu <yuch...@huawei.com>
Cc: Christopher Lameter <c...@linux.com>
Cc: linux-fsde...@vger.kernel.org
Cc: sta...@vger.kernel.org
Suggested-by: Matthew Wilcox <wi...@infradead.org>
Reported-by: Chris Fries <cfr...@google.com>
Signed-off-by: Minchan Kim <minc...@kernel.org>
---
 mm/filemap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index ab77e19ab09c..5f3311edfea4 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -786,7 +786,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
        VM_BUG_ON_PAGE(!PageLocked(new), new);
        VM_BUG_ON_PAGE(new->mapping, new);
-       error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+       error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
        if (!error) {
                struct address_space *mapping = old->mapping;
                void (*freepage)(struct page *);
@@ -842,7 +842,7 @@ static int __add_to_page_cache_locked(struct page *page,
                        return error;
-       error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+       error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
        if (error) {
                if (!huge)
                        mem_cgroup_cancel_charge(page, memcg, false);
