Re: [PATCH] nilfs2: fix data loss with mmap()
On Wed, 17 Sep 2014 21:34:39 +0900 (JST), Ryusuke Konishi wrote: On Wed, 17 Sep 2014 10:16:46 +0200, Andreas Rohner wrote: On 2014-09-16 15:57, Ryusuke Konishi wrote: On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote: I'd appreciate your help on testing the patch for some old kernels. (And, please declare a Tested-by tag in the reply mail, if the test is ok). Sure I have everything set up. Which kernels do I have to test? Was commit 136e877 backported? I presume at least stable and some of the longterm kernels on https://www.kernel.org/... The commit 136e877 was merged to v3.10 and backported to stable trees of earlier kernels. But, most of earlier stable trees are no longer maintained. Well maintained trees are the following longterm kernels: - 3.4.y (backported commit 136e877) - 3.10.y - 3.14.y I think these three kernels are worty to be tested. I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The bug is present in all of them and the patch fixes it. The patch also applies cleanly on all kernels. I sent it again yesterday, and added the Tested-by: tag. One thing I have a question. Is the original issue that commit 136e877 fixed still OK ? If you haven't tested it, I apprecicate if you examine the test for the prior issue. Thanks, Ryusuke Konishi -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On 2014-09-18 09:24, Ryusuke Konishi wrote: On Wed, 17 Sep 2014 21:34:39 +0900 (JST), Ryusuke Konishi wrote: On Wed, 17 Sep 2014 10:16:46 +0200, Andreas Rohner wrote: On 2014-09-16 15:57, Ryusuke Konishi wrote: On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote: I'd appreciate your help on testing the patch for some old kernels. (And, please declare a Tested-by tag in the reply mail, if the test is ok). Sure I have everything set up. Which kernels do I have to test? Was commit 136e877 backported? I presume at least stable and some of the longterm kernels on https://www.kernel.org/... The commit 136e877 was merged to v3.10 and backported to stable trees of earlier kernels. But, most of earlier stable trees are no longer maintained. Well maintained trees are the following longterm kernels: - 3.4.y (backported commit 136e877) - 3.10.y - 3.14.y I think these three kernels are worty to be tested. I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The bug is present in all of them and the patch fixes it. The patch also applies cleanly on all kernels. I sent it again yesterday, and added the Tested-by: tag. One thing I have a question. Is the original issue that commit 136e877 fixed still OK ? If you haven't tested it, I apprecicate if you examine the test for the prior issue. Yes the original issue is still fixed. I was able to reproduce it by reverting nilfs_set_page_dirty() to the state prior to commit 136e877. Then I used the new version of nilfs_set_page_dirty() including my patch and I couldn't reproduce the issue anymore. So it seems to still be fixed. Furthermore the original issue was caused by the use of __set_page_dirty_buffers(), which marked unmapped buffers as dirty. My patch does not change the fix for that. br, Andreas Rohner -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] nilfs2: fix data loss with mmap()
From: Andreas Rohner andreas.roh...@gmx.net This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: [BEGIN SCRIPT] mkfs.nilfs2 -f /dev/sdb mount /dev/sdb /mnt dd if=/dev/zero bs=1M count=30 of=/mnt/testfile umount /mnt mount /dev/sdb /mnt CHECKSUM_BEFORE=$(md5sum /mnt/testfile) /root/mmaptest/mmaptest /mnt/testfile 30 10 5 sync CHECKSUM_AFTER=$(md5sum /mnt/testfile) umount /mnt mount /dev/sdb /mnt CHECKSUM_AFTER_REMOUNT=$(md5sum /mnt/testfile) umount /mnt echo BEFORE MMAP:\t$CHECKSUM_BEFORE echo AFTER MMAP:\t$CHECKSUM_AFTER echo AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT [END SCRIPT] The mmaptest tool looks something like this (very simplified, with error checking removed): [BEGIN mmaptest] data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE, MAP_SHARED, fd, file_offset); for (i = 0; i write_count; ++i) { memcpy(data + i * 4096, buf, sizeof(buf)); msync(data, file_size - file_offset, MS_SYNC)) } [END mmaptest] The output of the script looks something like this: BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile So it is clear, that the changes done using mmap() do not survive a remount. This can be reproduced a 100% of the time. The problem was introduced with the following commit: 136e877 nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary If the page was read with mpage_readpage() or mpage_readpages() for example, then it has no buffers attached to it. In that case page_has_buffers(page) in nilfs_set_page_dirty() will be false. Therefore nilfs_set_file_dirty() is never called and the pages are never collected and never written to disk. This patch fixes the problem by also calling nilfs_set_file_dirty() if the page has no buffers attached to it. Signed-off-by: Andreas Rohner andreas.roh...@gmx.net Tested-by: Andreas Rohner andreas.roh...@gmx.net Signed-off-by: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp Cc: sta...@vger.kernel.org --- fs/nilfs2/inode.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index 6252b17..9e3c525 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct writeback_control *wbc) static int nilfs_set_page_dirty(struct page *page) { + struct inode *inode = page-mapping-host; int ret = __set_page_dirty_nobuffers(page); if (page_has_buffers(page)) { - struct inode *inode = page-mapping-host; unsigned nr_dirty = 0; struct buffer_head *bh, *head; @@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page) if (nr_dirty) nilfs_set_file_dirty(inode, nr_dirty); + } else if (ret) { + unsigned nr_dirty = 1 (PAGE_SHIFT - inode-i_blkbits); + + nilfs_set_file_dirty(inode, nr_dirty); } return ret; } -- 1.7.9.4 -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On 2014-09-16 15:57, Ryusuke Konishi wrote: On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote: I'd appreciate your help on testing the patch for some old kernels. (And, please declare a Tested-by tag in the reply mail, if the test is ok). Sure I have everything set up. Which kernels do I have to test? Was commit 136e877 backported? I presume at least stable and some of the longterm kernels on https://www.kernel.org/... The commit 136e877 was merged to v3.10 and backported to stable trees of earlier kernels. But, most of earlier stable trees are no longer maintained. Well maintained trees are the following longterm kernels: - 3.4.y (backported commit 136e877) - 3.10.y - 3.14.y I think these three kernels are worty to be tested. I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The bug is present in all of them and the patch fixes it. The patch also applies cleanly on all kernels. I sent it again yesterday, and added the Tested-by: tag. br, Andreas Rohner -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On 2014-09-16 06:42, Ryusuke Konishi wrote: On Tue, 16 Sep 2014 00:24:05 +0200, Andreas Rohner wrote: On 2014-09-16 00:01, Ryusuke Konishi wrote: Hi Andreas, On Mon, 15 Sep 2014 21:47:30 +0200, Andreas Rohner wrote: This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: snip Thank you for reporting this issue. I just stumbled upon the weird behaviour of mmap() while testing the nilfs_sync_fs() patch. I'd like to look into this patch, it looks to point out an important regression, but it may take some time since I am quite busy this week.. Of course. I understand. The patch looks correct. It is my mistake that the commit 136e877 leaked consideration for the case where the page doesn't have buffer heads. This fix should be backported to stable kernels. (I'll add a Cc: stable tag when sending this to Andrew.) Did you confirm that the patch works as expected ? Yes at least with the current master kernel: BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 3d9183f1c471b9baff15c9cc8d12c303 /mnt/testfile AFTER REMOUNT: 3d9183f1c471b9baff15c9cc8d12c303 /mnt/testfile For the record here is the little tool I used for testing mmap: https://github.com/zeitgeist87/mmaptest It writes pages with random bytes at a certain offset using mmap. The frequent umounts and mounts in the test script are necessary to purge the page cache. There is probably a better way of doing that. I'd appreciate your help on testing the patch for some old kernels. (And, please declare a Tested-by tag in the reply mail, if the test is ok). Sure I have everything set up. Which kernels do I have to test? Was commit 136e877 backported? I presume at least stable and some of the longterm kernels on https://www.kernel.org/... By the way thanks for your continued effort and time investment in reviewing my patches. br, Andreas Rohner -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] nilfs2: fix data loss with mmap()
This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: [BEGIN SCRIPT] mkfs.nilfs2 -f /dev/sdb mount /dev/sdb /mnt # create 30MB testfile dd if=/dev/zero bs=1M count=30 of=/mnt/testfile umount /mnt mount /dev/sdb /mnt CHECKSUM_BEFORE=$(md5sum /mnt/testfile) # simple tool that opens /mnt/testfile and # writes a few blocks using mmap at a 5MB offset /root/mmaptest/mmaptest /mnt/testfile 30 10 5 sync CHECKSUM_AFTER=$(md5sum /mnt/testfile) umount /mnt mount /dev/sdb /mnt CHECKSUM_AFTER_REMOUNT=$(md5sum /mnt/testfile) umount /mnt echo BEFORE MMAP:\t$CHECKSUM_BEFORE echo AFTER MMAP:\t$CHECKSUM_AFTER echo AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT [END SCRIPT] The mmaptest tool looks something like this (very simplified, with error checking removed): [BEGIN mmaptest] data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE, MAP_SHARED, fd, file_offset); for (i = 0; i write_count; ++i) { memcpy(data + i * 4096, buf, sizeof(buf)); msync(data, file_size - file_offset, MS_SYNC)) } [END mmaptest] The output of the script looks something like this: BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile So it is clear, that the changes done using mmap() do not survive a remount. This can be reproduced a 100% of the time. The problem was introduced with the following commit: 136e877 nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary If the page was read with mpage_readpage() or mpage_readpages() for example, then it has no buffers attached to it. In that case page_has_buffers(page) in nilfs_set_page_dirty() will be false. Therefore nilfs_set_file_dirty() is never called and the pages are never collected and never written to disk. This patch fixes the problem by also calling nilfs_set_file_dirty() if the page has no buffers attached to it. Signed-off-by: Andreas Rohner andreas.roh...@gmx.net Tested-by: Andreas Rohner andreas.roh...@gmx.net --- fs/nilfs2/inode.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index 6252b17..9e3c525 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct writeback_control *wbc) static int nilfs_set_page_dirty(struct page *page) { + struct inode *inode = page-mapping-host; int ret = __set_page_dirty_nobuffers(page); if (page_has_buffers(page)) { - struct inode *inode = page-mapping-host; unsigned nr_dirty = 0; struct buffer_head *bh, *head; @@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page) if (nr_dirty) nilfs_set_file_dirty(inode, nr_dirty); + } else if (ret) { + unsigned nr_dirty = 1 (PAGE_SHIFT - inode-i_blkbits); + + nilfs_set_file_dirty(inode, nr_dirty); } return ret; } -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On 2014-09-16 00:01, Ryusuke Konishi wrote: Hi Andreas, On Mon, 15 Sep 2014 21:47:30 +0200, Andreas Rohner wrote: This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: snip Thank you for reporting this issue. I just stumbled upon the weird behaviour of mmap() while testing the nilfs_sync_fs() patch. I'd like to look into this patch, it looks to point out an important regression, but it may take some time since I am quite busy this week.. Of course. I understand. br, Andreas Rohner -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] nilfs2: fix data loss in mmap page write for hole blocks
Please consider including the attached patch to stable trees between 2.6.30 and 2.6.37. This is the modified version of the following commit which is already in upstream. commit: 34094537943113467faee98fe67c8a3d3f9a0a8b From: Ryusuke Konishi konishi.ryusuke at lab.ntt.co.jp Date: Sun, 27 Mar 2011 22:50:49 +0900 Subject: [PATCH] nilfs2: fix data loss in mmap page write for hole blocks This commit is applicable to 2.6.38, but is not to prior kernels as is. It will cause a build error against older kernels due to a change of nilfs_set_file_dirty function. I tested the attached patch for every kernel through from 2.6.30 to 2.6.37. It was all right except that 3-way merge was required for 2.6.30 and 2.6.31. Thanks, Ryusuke Konishi -- From: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp nilfs2: fix data loss in mmap page write for hole blocks From the result of a function test of mmap, mmap write to shared pages turned out to be broken for hole blocks. It doesn't write out filled blocks and the data will be lost after umount. This is due to a bug that the target file is not queued for log writer when filling hole blocks. Also, nilfs_page_mkwrite function exits normal code path even after successfully filled hole blocks due to a change of block_page_mkwrite function; just after nilfs was merged into the mainline, block_page_mkwrite() started to return VM_FAULT_LOCKED instead of zero by the patch mm: close page_mkwrite races (commit: b827e496c893de0c). The current nilfs_page_mkwrite() is not handling this value properly. This corrects nilfs_page_mkwrite() and will resolve the data loss problem in mmap write. [A minor adjustment was applied for 2.6.37 and prior kernels] Signed-off-by: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp Tested-by: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp Cc: stable sta...@kernel.org [2.6.30-2.6.37] --- fs/nilfs2/file.c | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c index c9a30d7..d9d5d81 100644 --- a/fs/nilfs2/file.c +++ b/fs/nilfs2/file.c @@ -72,10 +72,9 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) /* * check to see if the page is mapped already (no holes) */ - if (PageMappedToDisk(page)) { - unlock_page(page); + if (PageMappedToDisk(page)) goto mapped; - } + if (page_has_buffers(page)) { struct buffer_head *bh, *head; int fully_mapped = 1; @@ -90,7 +89,6 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) if (fully_mapped) { SetPageMappedToDisk(page); - unlock_page(page); goto mapped; } } @@ -105,16 +103,18 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) return VM_FAULT_SIGBUS; ret = block_page_mkwrite(vma, vmf, nilfs_get_block); - if (unlikely(ret)) { + if (ret != VM_FAULT_LOCKED) { nilfs_transaction_abort(inode-i_sb); return ret; } + nilfs_set_file_dirty(NILFS_SB(inode-i_sb), inode, +1 (PAGE_SHIFT - inode-i_blkbits)); nilfs_transaction_commit(inode-i_sb); mapped: SetPageChecked(page); wait_on_page_writeback(page); - return 0; + return VM_FAULT_LOCKED; } static const struct vm_operations_struct nilfs_file_vm_ops = { -- 1.7.3.5 -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss in mmap page write for hole blocks
On Mon, 28 Mar 2011 10:15:50 +0200, dexen deVries wrote: Hi, On Sunday 27 of March 2011 17:47:53 Ryusuke wrote: From the result of a function test of mmap, mmap write to shared pages turned out to be broken for hole blocks. It doesn't write out filled blocks and the data will be lost after umount. This is due to a bug that the target file is not queued for log writer when filling hole blocks. (...) recently I've experienced a few times data loss hang of several applications with ktorrent running. It seems the transfered files were mostly empty after reboot. Related to this regression fix? I don't know your applications actually do what. If you find mmap calls for file IOs in strace analysis, this issue seems the likely cause. Thanks, Ryusuke Konishi -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html