Re: [PATCH 3/3] mm/fs: fix pessimization in hole-punching pagecache
On 07/02/2014 09:13 PM, Hugh Dickins wrote:

I wanted to revert my v3.1 commit d0823576bf4b ("mm: pincer in truncate_inode_pages_range"), to keep truncate_inode_pages_range() in synch with shmem_undo_range(); but have stepped back - a change to hole-punching in truncate_inode_pages_range() is a change to hole-punching in every filesystem (except tmpfs) that supports it.

If there's a logical proof why no filesystem can depend for its own correctness on the pincer guarantee in truncate_inode_pages_range() - an instant when the entire hole is removed from pagecache - then let's revisit later. But the evidence is that only tmpfs suffered from the livelock, and we have no intention of extending hole-punch to ramfs. So for now just add a few comments (to match or differ from those in shmem_undo_range()), and fix one silliness noticed in d0823576bf4b...

Its "index == start" addition to the hole-punch termination test was incomplete: it opened a way for the end condition to be missed, and the loop to go on looking through the radix_tree, all the way to end of file. Fix that pessimization by resetting index when detected in inner loop.

Note that it's actually hard to hit this case, without the obsessive concurrent faulting that trinity does: normally all pages are removed in the initial trylock_page() pass, and this loop finds nothing to do. I had to "#if 0" out the initial pass to reproduce bug and test fix.
Signed-off-by: Hugh Dickins
Cc: Sasha Levin
Acked-by: Vlastimil Babka
Cc: Konstantin Khlebnikov
Cc: Lukas Czerner
Cc: Dave Jones
---

 mm/truncate.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- 3.16-rc3/mm/truncate.c	2014-06-08 11:19:54.0 -0700
+++ linux/mm/truncate.c	2014-07-02 03:36:05.972553533 -0700
@@ -355,14 +355,16 @@ void truncate_inode_pages_range(struct a
 	for ( ; ; ) {
 		cond_resched();
 		if (!pagevec_lookup_entries(&pvec, mapping, index,
-			min(end - index, (pgoff_t)PAGEVEC_SIZE),
-			indices)) {
+			min(end - index, (pgoff_t)PAGEVEC_SIZE), indices)) {
+			/* If all gone from start onwards, we're done */
 			if (index == start)
 				break;
+			/* Otherwise restart to make sure all gone */
 			index = start;
 			continue;
 		}
 		if (index == start && indices[0] >= end) {
+			/* All gone out of hole to be punched, we're done */
 			pagevec_remove_exceptionals(&pvec);
 			pagevec_release(&pvec);
 			break;
@@ -373,8 +375,11 @@ void truncate_inode_pages_range(struct a
 			/* We rely upon deletion not changing page->index */
 			index = indices[i];
-			if (index >= end)
+			if (index >= end) {
+				/* Restart punch to make sure all gone */
+				index = start - 1;
 				break;
+			}

 			if (radix_tree_exceptional_entry(page)) {
 				clear_exceptional_entry(mapping, index, page);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/