Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On Thu, Apr 20, 2017 at 04:44:31PM +0200, Jan Kara wrote: > On Thu 20-04-17 16:35:10, Jan Kara wrote: > > On Wed 19-04-17 13:28:36, Ross Zwisler wrote: > > > On Wed, Apr 19, 2017 at 06:11:31PM +0300, Andrey Ryabinin wrote: > > > > On 04/18/2017 10:38 PM, Ross Zwisler wrote: > > > > > On Fri, Apr 14, 2017 at 05:07:50PM +0300, Andrey Ryabinin wrote: > > > > >> Some direct write fs hooks call invalidate_inode_pages2[_range]() > > > > >> conditionally iff mapping->nrpages is not zero. If page cache is > > > > >> empty, > > > > >> buffered read following after direct IO write would get stale data > > > > >> from > > > > >> the cleancache. > > > > >> > > > > >> Also it doesn't feel right to check only for ->nrpages because > > > > >> invalidate_inode_pages2[_range] invalidates exceptional entries as > > > > >> well. > > > > >> > > > > >> Fix this by calling invalidate_inode_pages2[_range]() regardless of > > > > >> nrpages > > > > >> state. > > > > >> > > > > >> Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache") > > > > >> Signed-off-by: Andrey Ryabinin > > > > >> --- > > > > > <> > > > > >> diff --git a/fs/dax.c b/fs/dax.c > > > > >> index 2e382fe..1e8cca0 100644 > > > > >> --- a/fs/dax.c > > > > >> +++ b/fs/dax.c > > > > >> @@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t > > > > >> pos, loff_t length, void *data, > > > > >> * into page tables. We have to tear down these mappings so > > > > >> that data > > > > >> * written by write(2) is visible in mmap. > > > > >> */ > > > > >> -if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { > > > > >> +if ((iomap->flags & IOMAP_F_NEW)) { > > > > >> invalidate_inode_pages2_range(inode->i_mapping, > > > > >>pos >> PAGE_SHIFT, > > > > >>(end - 1) >> PAGE_SHIFT); > > > > > > > > > > tl;dr: I think the old code is correct, and that you don't need this > > > > > change. 
> > > > > > > > > > This should be harmless, but could slow us down a little if we keep > > > > > calling invalidate_inode_pages2_range() without really needing to. > > > > > Really for > > > > > DAX I think we need to call invalidate_inode_page2_range() only if we > > > > > have > > > > > zero pages mapped over the place where we are doing I/O, which is why > > > > > we check > > > > > nrpages. > > > > > > > > > > > > > Check for ->nrpages only looks strange, because > > > > invalidate_inode_pages2_range() also > > > > invalidates exceptional radix tree entries. Is that correct that we > > > > invalidate > > > > exceptional entries only if ->nrpages > 0 and skip invalidation > > > > otherwise? > > > > > > For DAX we only invalidate clean DAX exceptional entries so that we can > > > keep > > > dirty entries around for writeback, but yes you're correct that we only > > > do the > > > invalidation if nrpages > 0. And yes, it does seem a bit weird. :) > > > > Actually in this place the nrpages check is deliberate since there should > > only be hole pages or nothing in the invalidated range - see the comment > > before the if. But thinking more about it this assumption actually is not > > right in presence of zero PMD entries in the radix tree. So this change > > actually also fixes a possible bug for DAX but we should do it as a > > separate patch with a proper changelog. > > Something like the attached patch. Ross? Yep, great catch, this is a real issue. The attached patch isn't sufficient, though, because invalidate_inode_pages2_range() for DAX exceptional entries only wipes out the radix tree entry, and doesn't call unmap_mapping_range() as it does in the case of real pages. I'm working on a fix and an associated xfstest test.
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On Thu 20-04-17 16:35:10, Jan Kara wrote: > On Wed 19-04-17 13:28:36, Ross Zwisler wrote: > > On Wed, Apr 19, 2017 at 06:11:31PM +0300, Andrey Ryabinin wrote: > > > On 04/18/2017 10:38 PM, Ross Zwisler wrote: > > > > On Fri, Apr 14, 2017 at 05:07:50PM +0300, Andrey Ryabinin wrote: > > > >> Some direct write fs hooks call invalidate_inode_pages2[_range]() > > > >> conditionally iff mapping->nrpages is not zero. If page cache is empty, > > > >> buffered read following after direct IO write would get stale data from > > > >> the cleancache. > > > >> > > > >> Also it doesn't feel right to check only for ->nrpages because > > > >> invalidate_inode_pages2[_range] invalidates exceptional entries as > > > >> well. > > > >> > > > >> Fix this by calling invalidate_inode_pages2[_range]() regardless of > > > >> nrpages > > > >> state. > > > >> > > > >> Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache") > > > >> Signed-off-by: Andrey Ryabinin > > > >> --- > > > > <> > > > >> diff --git a/fs/dax.c b/fs/dax.c > > > >> index 2e382fe..1e8cca0 100644 > > > >> --- a/fs/dax.c > > > >> +++ b/fs/dax.c > > > >> @@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, > > > >> loff_t length, void *data, > > > >> * into page tables. We have to tear down these mappings so > > > >> that data > > > >> * written by write(2) is visible in mmap. > > > >> */ > > > >> - if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { > > > >> + if ((iomap->flags & IOMAP_F_NEW)) { > > > >>invalidate_inode_pages2_range(inode->i_mapping, > > > >> pos >> PAGE_SHIFT, > > > >> (end - 1) >> PAGE_SHIFT); > > > > > > > > tl;dr: I think the old code is correct, and that you don't need this > > > > change. > > > > > > > > This should be harmless, but could slow us down a little if we keep > > > > calling invalidate_inode_pages2_range() without really needing to. 
> > > > Really for > > > > DAX I think we need to call invalidate_inode_page2_range() only if we > > > > have > > > > zero pages mapped over the place where we are doing I/O, which is why > > > > we check > > > > nrpages. > > > > > > > > > > Check for ->nrpages only looks strange, because > > > invalidate_inode_pages2_range() also > > > invalidates exceptional radix tree entries. Is that correct that we > > > invalidate > > > exceptional entries only if ->nrpages > 0 and skip invalidation otherwise? > > > > For DAX we only invalidate clean DAX exceptional entries so that we can keep > > dirty entries around for writeback, but yes you're correct that we only do > > the > > invalidation if nrpages > 0. And yes, it does seem a bit weird. :) > > Actually in this place the nrpages check is deliberate since there should > only be hole pages or nothing in the invalidated range - see the comment > before the if. But thinking more about it this assumption actually is not > right in presence of zero PMD entries in the radix tree. So this change > actually also fixes a possible bug for DAX but we should do it as a > separate patch with a proper changelog. Something like the attached patch. Ross? Honza -- Jan Kara SUSE Labs, CR From da79b4b72a6fe5fcf1a554ca1ce77cb462e8a306 Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Thu, 20 Apr 2017 16:38:20 +0200 Subject: [PATCH] dax: Fix inconsistency between mmap and write(2) When a process has a PMD-sized hole mapped via mmap and later allocates part of the file underlying this area using write(2), memory mappings need not be appropriately invalidated if the file has no hole pages allocated and thus view via mmap will not show the data written by write(2). Fix the problem by always invalidating memory mappings covering part of the file for which blocks got allocated. 
Signed-off-by: Jan Kara --- fs/dax.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 85abd741253d..da7bc44e5725 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1028,11 +1028,11 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, return -EIO; /* - * Write can allocate block for an area which has a hole page mapped - * into page tables. We have to tear down these mappings so that data - * written by write(2) is visible in mmap. + * Write can allocate block for an area which has a hole page or zero + * PMD entry in the radix tree. We have to tear down these mappings so + * that data written by write(2) is visible in mmap. */ - if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { + if (iomap->flags & IOMAP_F_NEW) { invalidate_inode_pages2_range(inode->i_mapping, pos >> PAGE_SHIFT, (end - 1) >> PAGE_SHIFT); -- 2.12.0
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On Wed 19-04-17 13:28:36, Ross Zwisler wrote: > On Wed, Apr 19, 2017 at 06:11:31PM +0300, Andrey Ryabinin wrote: > > On 04/18/2017 10:38 PM, Ross Zwisler wrote: > > > On Fri, Apr 14, 2017 at 05:07:50PM +0300, Andrey Ryabinin wrote: > > >> Some direct write fs hooks call invalidate_inode_pages2[_range]() > > >> conditionally iff mapping->nrpages is not zero. If page cache is empty, > > >> buffered read following after direct IO write would get stale data from > > >> the cleancache. > > >> > > >> Also it doesn't feel right to check only for ->nrpages because > > >> invalidate_inode_pages2[_range] invalidates exceptional entries as well. > > >> > > >> Fix this by calling invalidate_inode_pages2[_range]() regardless of > > >> nrpages > > >> state. > > >> > > >> Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache") > > >> Signed-off-by: Andrey Ryabinin > > >> --- > > > <> > > >> diff --git a/fs/dax.c b/fs/dax.c > > >> index 2e382fe..1e8cca0 100644 > > >> --- a/fs/dax.c > > >> +++ b/fs/dax.c > > >> @@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, > > >> loff_t length, void *data, > > >> * into page tables. We have to tear down these mappings so > > >> that data > > >> * written by write(2) is visible in mmap. > > >> */ > > >> -if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { > > >> +if ((iomap->flags & IOMAP_F_NEW)) { > > >> invalidate_inode_pages2_range(inode->i_mapping, > > >>pos >> PAGE_SHIFT, > > >>(end - 1) >> PAGE_SHIFT); > > > > > > tl;dr: I think the old code is correct, and that you don't need this > > > change. > > > > > > This should be harmless, but could slow us down a little if we keep > > > calling invalidate_inode_pages2_range() without really needing to. > > > Really for > > > DAX I think we need to call invalidate_inode_page2_range() only if we have > > > zero pages mapped over the place where we are doing I/O, which is why we > > > check > > > nrpages. 
> > > > > > > Check for ->nrpages only looks strange, because > > invalidate_inode_pages2_range() also > > invalidates exceptional radix tree entries. Is that correct that we > > invalidate > > exceptional entries only if ->nrpages > 0 and skip invalidation otherwise? > > For DAX we only invalidate clean DAX exceptional entries so that we can keep > dirty entries around for writeback, but yes you're correct that we only do the > invalidation if nrpages > 0. And yes, it does seem a bit weird. :) Actually in this place the nrpages check is deliberate since there should only be hole pages or nothing in the invalidated range - see the comment before the if. But thinking more about it this assumption actually is not right in presence of zero PMD entries in the radix tree. So this change actually also fixes a possible bug for DAX but we should do it as a separate patch with a proper changelog. Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On Wed, Apr 19, 2017 at 06:11:31PM +0300, Andrey Ryabinin wrote: > On 04/18/2017 10:38 PM, Ross Zwisler wrote: > > On Fri, Apr 14, 2017 at 05:07:50PM +0300, Andrey Ryabinin wrote: > >> Some direct write fs hooks call invalidate_inode_pages2[_range]() > >> conditionally iff mapping->nrpages is not zero. If page cache is empty, > >> buffered read following after direct IO write would get stale data from > >> the cleancache. > >> > >> Also it doesn't feel right to check only for ->nrpages because > >> invalidate_inode_pages2[_range] invalidates exceptional entries as well. > >> > >> Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages > >> state. > >> > >> Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache") > >> Signed-off-by: Andrey Ryabinin > >> --- > > <> > >> diff --git a/fs/dax.c b/fs/dax.c > >> index 2e382fe..1e8cca0 100644 > >> --- a/fs/dax.c > >> +++ b/fs/dax.c > >> @@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, > >> loff_t length, void *data, > >> * into page tables. We have to tear down these mappings so that data > >> * written by write(2) is visible in mmap. > >> */ > >> - if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { > >> + if ((iomap->flags & IOMAP_F_NEW)) { > >>invalidate_inode_pages2_range(inode->i_mapping, > >> pos >> PAGE_SHIFT, > >> (end - 1) >> PAGE_SHIFT); > > > > tl;dr: I think the old code is correct, and that you don't need this change. > > > > This should be harmless, but could slow us down a little if we keep > > calling invalidate_inode_pages2_range() without really needing to. Really > > for > > DAX I think we need to call invalidate_inode_page2_range() only if we have > > zero pages mapped over the place where we are doing I/O, which is why we > > check > > nrpages. > > > > Check for ->nrpages only looks strange, because > invalidate_inode_pages2_range() also > invalidates exceptional radix tree entries. 
Is that correct that we invalidate > exceptional entries only if ->nrpages > 0 and skip invalidation otherwise? For DAX we only invalidate clean DAX exceptional entries so that we can keep dirty entries around for writeback, but yes you're correct that we only do the invalidation if nrpages > 0. And yes, it does seem a bit weird. :)
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On 04/19/2017 01:46 AM, Andrew Morton wrote: > On Fri, 14 Apr 2017 17:07:50 +0300 Andrey Ryabinin > wrote: > >> Some direct write fs hooks call invalidate_inode_pages2[_range]() >> conditionally iff mapping->nrpages is not zero. If page cache is empty, >> buffered read following after direct IO write would get stale data from >> the cleancache. >> >> Also it doesn't feel right to check only for ->nrpages because >> invalidate_inode_pages2[_range] invalidates exceptional entries as well. >> >> Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages >> state. > > I'm not understanding this. I can buy the argument about > nrexceptional, but why does cleancache require the > invalidate_inode_pages2_range() call even when ->nrpages is zero? > > I *assume* it's because invalidate_inode_pages2_range() calls > cleancache_invalidate_inode(), yes? If so, can we please add this to > the changelog? If not then please explain further. > Yes, your assumption is correct. I'll fix the changelog.
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On 04/18/2017 10:38 PM, Ross Zwisler wrote: > On Fri, Apr 14, 2017 at 05:07:50PM +0300, Andrey Ryabinin wrote: >> Some direct write fs hooks call invalidate_inode_pages2[_range]() >> conditionally iff mapping->nrpages is not zero. If page cache is empty, >> buffered read following after direct IO write would get stale data from >> the cleancache. >> >> Also it doesn't feel right to check only for ->nrpages because >> invalidate_inode_pages2[_range] invalidates exceptional entries as well. >> >> Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages >> state. >> >> Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache") >> Signed-off-by: Andrey Ryabinin >> --- > <> >> diff --git a/fs/dax.c b/fs/dax.c >> index 2e382fe..1e8cca0 100644 >> --- a/fs/dax.c >> +++ b/fs/dax.c >> @@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, >> loff_t length, void *data, >> * into page tables. We have to tear down these mappings so that data >> * written by write(2) is visible in mmap. >> */ >> -if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { >> +if ((iomap->flags & IOMAP_F_NEW)) { >> invalidate_inode_pages2_range(inode->i_mapping, >>pos >> PAGE_SHIFT, >>(end - 1) >> PAGE_SHIFT); > > tl;dr: I think the old code is correct, and that you don't need this change. > > This should be harmless, but could slow us down a little if we keep > calling invalidate_inode_pages2_range() without really needing to. Really for > DAX I think we need to call invalidate_inode_page2_range() only if we have > zero pages mapped over the place where we are doing I/O, which is why we check > nrpages. > Check for ->nrpages only looks strange, because invalidate_inode_pages2_range() also invalidates exceptional radix tree entries. Is that correct that we invalidate exceptional entries only if ->nrpages > 0 and skip invalidation otherwise? > Is DAX even allowed to be used at the same time as cleancache? 
From a brief > look at Documentation/vm/cleancache.txt, it seems like these two features are > incompatible. With DAX we already are avoiding the page cache completely. tl;dr: I think you're right. cleancache may store any PageUptodate && PageMappedToDisk page evicted from the page cache (see __delete_from_page_cache()). DAX deletes hole pages via __delete_from_page_cache(), but I don't see us marking such pages as Uptodate or MappedToDisk, so they will never go into the cleancache. Later, cleancache_get_page() may be consulted (e.g. it's called from mpage_readpages(), which is called from blkdev_read_pages()), but I assume that DAX doesn't use the a_ops->readpages() method, so cleancache_get_page() is never called from DAX. > Anyway, I don't see how this change in DAX can save us from a data corruption > (which is what you're seeing, right?), and I think it could slow us down, so > I'd prefer to leave things as they are. > I'll remove this hunk from v2. Thanks.
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On Fri, 14 Apr 2017 17:07:50 +0300 Andrey Ryabinin wrote:
> Some direct write fs hooks call invalidate_inode_pages2[_range]()
> conditionally iff mapping->nrpages is not zero. If page cache is empty,
> buffered read following after direct IO write would get stale data from
> the cleancache.
>
> Also it doesn't feel right to check only for ->nrpages because
> invalidate_inode_pages2[_range] invalidates exceptional entries as well.
>
> Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages
> state.

I'm not understanding this. I can buy the argument about nrexceptional,
but why does cleancache require the invalidate_inode_pages2_range() call
even when ->nrpages is zero?

I *assume* it's because invalidate_inode_pages2_range() calls
cleancache_invalidate_inode(), yes? If so, can we please add this to the
changelog? If not then please explain further.
Re: [PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
On Fri, Apr 14, 2017 at 05:07:50PM +0300, Andrey Ryabinin wrote:
> Some direct write fs hooks call invalidate_inode_pages2[_range]()
> conditionally iff mapping->nrpages is not zero. If page cache is empty,
> buffered read following after direct IO write would get stale data from
> the cleancache.
>
> Also it doesn't feel right to check only for ->nrpages because
> invalidate_inode_pages2[_range] invalidates exceptional entries as well.
>
> Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages
> state.
>
> Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache")
> Signed-off-by: Andrey Ryabinin
> ---

<>

> diff --git a/fs/dax.c b/fs/dax.c
> index 2e382fe..1e8cca0 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
>  	 * into page tables. We have to tear down these mappings so that data
>  	 * written by write(2) is visible in mmap.
>  	 */
> -	if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) {
> +	if ((iomap->flags & IOMAP_F_NEW)) {
>  		invalidate_inode_pages2_range(inode->i_mapping,
>  					      pos >> PAGE_SHIFT,
>  					      (end - 1) >> PAGE_SHIFT);

tl;dr: I think the old code is correct, and that you don't need this change.

This should be harmless, but could slow us down a little if we keep
calling invalidate_inode_pages2_range() without really needing to. Really for
DAX I think we need to call invalidate_inode_pages2_range() only if we have
zero pages mapped over the place where we are doing I/O, which is why we check
nrpages.

Is DAX even allowed to be used at the same time as cleancache? From a brief
look at Documentation/vm/cleancache.txt, it seems like these two features are
incompatible. With DAX we already are avoiding the page cache completely.

Anyway, I don't see how this change in DAX can save us from a data corruption
(which is what you're seeing, right?), and I think it could slow us down, so
I'd prefer to leave things as they are.
[PATCH 1/4] fs: fix data invalidation in the cleancache during direct IO
Some direct write fs hooks call invalidate_inode_pages2[_range]()
conditionally iff mapping->nrpages is not zero. If page cache is empty,
buffered read following after direct IO write would get stale data from
the cleancache.

Also it doesn't feel right to check only for ->nrpages because
invalidate_inode_pages2[_range] invalidates exceptional entries as well.

Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages
state.

Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache")
Signed-off-by: Andrey Ryabinin
---
 fs/9p/vfs_file.c |  2 +-
 fs/cifs/inode.c  |  2 +-
 fs/dax.c         |  2 +-
 fs/iomap.c       | 16 +++-
 fs/nfs/direct.c  |  6 ++
 fs/nfs/inode.c   |  8 +---
 mm/filemap.c     | 26 +++---
 7 files changed, 28 insertions(+), 34 deletions(-)

diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 3de3b4a8..786d0de 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -423,7 +423,7 @@ v9fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 			unsigned long pg_start, pg_end;
 			pg_start = origin >> PAGE_SHIFT;
 			pg_end = (origin + retval - 1) >> PAGE_SHIFT;
-			if (inode->i_mapping && inode->i_mapping->nrpages)
+			if (inode->i_mapping)
 				invalidate_inode_pages2_range(inode->i_mapping,
 							      pg_start, pg_end);
 			iocb->ki_pos += retval;
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index c3b2fa0..6539fa3 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1857,7 +1857,7 @@ cifs_invalidate_mapping(struct inode *inode)
 {
 	int rc = 0;

-	if (inode->i_mapping && inode->i_mapping->nrpages != 0) {
+	if (inode->i_mapping) {
 		rc = invalidate_inode_pages2(inode->i_mapping);
 		if (rc)
 			cifs_dbg(VFS, "%s: could not invalidate inode %p\n",
diff --git a/fs/dax.c b/fs/dax.c
index 2e382fe..1e8cca0 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1047,7 +1047,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	 * into page tables. We have to tear down these mappings so that data
 	 * written by write(2) is visible in mmap.
 	 */
-	if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) {
+	if ((iomap->flags & IOMAP_F_NEW)) {
 		invalidate_inode_pages2_range(inode->i_mapping,
 					      pos >> PAGE_SHIFT,
 					      (end - 1) >> PAGE_SHIFT);
diff --git a/fs/iomap.c b/fs/iomap.c
index 0b457ff..7e1f947 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -880,16 +880,14 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		flags |= IOMAP_WRITE;
 	}

-	if (mapping->nrpages) {
-		ret = filemap_write_and_wait_range(mapping, start, end);
-		if (ret)
-			goto out_free_dio;
+	ret = filemap_write_and_wait_range(mapping, start, end);
+	if (ret)
+		goto out_free_dio;

-		ret = invalidate_inode_pages2_range(mapping,
+	ret = invalidate_inode_pages2_range(mapping,
 			start >> PAGE_SHIFT, end >> PAGE_SHIFT);
-		WARN_ON_ONCE(ret);
-		ret = 0;
-	}
+	WARN_ON_ONCE(ret);
+	ret = 0;

 	inode_dio_begin(inode);

@@ -944,7 +942,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	 * one is a pretty crazy thing to do, so we don't support it 100%.  If
 	 * this invalidation fails, tough, the write still worked...
 	 */
-	if (iov_iter_rw(iter) == WRITE && mapping->nrpages) {
+	if (iov_iter_rw(iter) == WRITE) {
 		int err = invalidate_inode_pages2_range(mapping,
 				start >> PAGE_SHIFT, end >> PAGE_SHIFT);
 		WARN_ON_ONCE(err);
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index aab32fc..183ab4d 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -1024,10 +1024,8 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)

 	result = nfs_direct_write_schedule_iovec(dreq, iter, pos);

-	if (mapping->nrpages) {
-		invalidate_inode_pages2_range(mapping,
-					      pos >> PAGE_SHIFT, end);
-	}
+	invalidate_inode_pages2_range(mapping,
+				      pos >> PAGE_SHIFT, end);

 	nfs_end_io_direct(inode);

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index f489a5a..b727ec8 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1118,10 +1118,12 @@ static int nfs_invalidate_mapping(struct inode *inode, struct address_space *map
 		if (ret < 0)
 			return ret;
 	}
-	ret = invalidate_inode_pages2(mapping);
-