On Wed 31-10-12 06:47:56, Greg KH wrote:
> On Wed, Oct 31, 2012 at 12:01:01PM +0100, Jan Kara wrote:
> >   Hello,
> > 
> >   it seems commit ef5d437f didn't make it into the 3.0-stable tree although
> > it was marked for it?
> 
> It didn't apply there, care to provide a backported version?
  Sure. Attached.

                                                                Honza
-- 
Jan Kara <[email protected]>
SUSE Labs, CR
From: Jan Kara <[email protected]>
Date: Wed, 26 Sep 2012 17:00:19 +0200
Subject: [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390
References: bnc#762259
Patch-mainline: 3.7-rc3
Git-commit: ef5d437f71afdf4afdbab99213add99f4b1318fd

On s390, any write to a page (even from the kernel itself) sets the
architecture-specific page dirty bit. Thus when a page is written to via a
buffered write, the HW dirty bit gets set, and when we later map and unmap the
page, page_remove_rmap() finds the dirty bit and calls set_page_dirty().
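
To illustrate the sequence, here is a minimal userspace sketch (assuming a
file on XFS on an s390 box; the file name is of course arbitrary):

	#include <fcntl.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		char buf[4096];
		void *addr;
		int fd = open("testfile", O_CREAT | O_RDWR, 0644);

		if (fd < 0)
			return 1;
		memset(buf, 'x', sizeof(buf));
		/* Buffered write: the kernel's copy into the page cache
		 * page sets the s390 HW dirty (change) bit. */
		if (write(fd, buf, sizeof(buf)) != sizeof(buf))
			return 1;

		addr = mmap(NULL, sizeof(buf), PROT_READ, MAP_SHARED, fd, 0);
		if (addr == MAP_FAILED)
			return 1;
		/* Fault the page in so a PTE exists... */
		(void)*(volatile char *)addr;
		/* ...and unmap it: page_remove_rmap() now finds the HW dirty
		 * bit and calls set_page_dirty(), although nothing ever wrote
		 * through the mapping. */
		munmap(addr, sizeof(buf));
		close(fd);
		return 0;
	}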

Dirtying a page which shouldn't be dirty can cause all sorts of problems for
filesystems. The bug we observed in practice is that buffers from the page get
freed, so when the page later gets marked dirty and writeback writes it, XFS
crashes due to the assertion BUG_ON(!PagePrivate(page)) in page_buffers(),
called from xfs_count_page_state().
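
For reference, page_buffers() is (roughly, as of 3.0) the following macro from
include/linux/buffer_head.h, so a dirty page that has lost its buffers trips
the BUG_ON:

	#define page_buffers(page)					\
		({							\
			BUG_ON(!PagePrivate(page));			\
			((struct buffer_head *)page_private(page));	\
		})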

A similar problem can also happen when a zero_user_segment() call from
xfs_vm_writepage() (or block_write_full_page() for that matter) sets the
hardware dirty bit during writeback; later the buffers get freed and the page
gets unmapped.
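
The write in that case is a plain kernel-side memset() of the page contents,
which on s390 is already enough to set the HW dirty bit; a simplified sketch
of zero_user_segment() from include/linux/highmem.h (the real helper goes
through zero_user_segments()):

	static inline void zero_user_segment(struct page *page,
					     unsigned start, unsigned end)
	{
		void *kaddr = kmap_atomic(page, KM_USER0);

		/* Kernel write to the page: sets the s390 storage-key
		 * change (dirty) bit. */
		memset(kaddr + start, 0, end - start);
		kunmap_atomic(kaddr, KM_USER0);
		flush_dcache_page(page);
	}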

Fix the issue by ignoring the s390 HW dirty bit for page cache pages of
mappings with mapping_cap_account_dirty(). This is safe because for such
mappings, when a page gets marked writeable in its PTE it is also marked dirty
in do_wp_page() or do_page_fault(). When the dirty bit is cleared by
clear_page_dirty_for_io(), the page gets write-protected in page_mkclean(). So
a pagecache page is writeable if and only if it is dirty.
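
For reference, mapping_cap_account_dirty() just checks whether the backing
device participates in dirty accounting; in 3.0 it is roughly (from
include/linux/backing-dev.h):

	static inline bool bdi_cap_account_dirty(struct backing_dev_info *bdi)
	{
		return !(bdi->capabilities & BDI_CAP_NO_ACCT_DIRTY);
	}

	static inline bool mapping_cap_account_dirty(struct address_space *mapping)
	{
		return bdi_cap_account_dirty(mapping->backing_dev_info);
	}

tmpfs and ramfs advertise BDI_CAP_NO_ACCT_DIRTY (as part of
BDI_CAP_NO_ACCT_AND_WRITEBACK), so their pages still take the
page_test_and_clear_dirty() path in page_remove_rmap().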

Thanks to Hugh Dickins <[email protected]> for pointing out that the mapping
has to have mapping_cap_account_dirty() for things to work, and for proposing
a cleaned-up variant of the patch.

CC: Martin Schwidefsky <[email protected]>
CC: Mel Gorman <[email protected]>
CC: [email protected]
Signed-off-by: Jan Kara <[email protected]>
---
 mm/rmap.c |   20 +++++++++++++++-----
 1 files changed, 15 insertions(+), 5 deletions(-)

Index: linux-3.0-SLE11-SP2-s390_dirty/mm/rmap.c
===================================================================
--- linux-3.0-SLE11-SP2-s390_dirty.orig/mm/rmap.c
+++ linux-3.0-SLE11-SP2-s390_dirty/mm/rmap.c
@@ -57,6 +57,7 @@
 #include <linux/mmu_notifier.h>
 #include <linux/migrate.h>
 #include <linux/hugetlb.h>
+#include <linux/backing-dev.h>
 
 #include <asm/tlbflush.h>
 
@@ -937,11 +938,8 @@ int page_mkclean(struct page *page)
 
 	if (page_mapped(page)) {
 		struct address_space *mapping = page_mapping(page);
-		if (mapping) {
+		if (mapping)
 			ret = page_mkclean_file(mapping, page);
-			if (page_test_and_clear_dirty(page_to_pfn(page), 1))
-				ret = 1;
-		}
 	}
 
 	return ret;
@@ -1122,6 +1120,8 @@ void page_add_file_rmap(struct page *pag
  */
 void page_remove_rmap(struct page *page)
 {
+	struct address_space *mapping = page_mapping(page);
+
 	/* page still mapped by someone else? */
 	if (!atomic_add_negative(-1, &page->_mapcount))
 		return;
@@ -1132,8 +1132,19 @@ void page_remove_rmap(struct page *page)
 	 * this if the page is anon, so about to be freed; but perhaps
 	 * not if it's in swapcache - there might be another pte slot
 	 * containing the swap entry, but page not yet written to swap.
+	 *
+	 * And we can skip it on file pages, so long as the filesystem
+	 * participates in dirty tracking; but need to catch shm and tmpfs
+	 * and ramfs pages which have been modified since creation by read
+	 * fault.
+	 *
+	 * Note that mapping must be decided above, before decrementing
+	 * mapcount (which luckily provides a barrier): once page is unmapped,
+	 * it could be truncated and page->mapping reset to NULL at any moment.
+	 * Note also that we are relying on page_mapping(page) to set mapping
+	 * to &swapper_space when PageSwapCache(page).
 	 */
-	if ((!PageAnon(page) || PageSwapCache(page)) &&
+	if (mapping && !mapping_cap_account_dirty(mapping) &&
 	    page_test_and_clear_dirty(page_to_pfn(page), 1))
 		set_page_dirty(page);
 	/*
