Re: [PATCH] nilfs2: fix data loss with mmap()

2014-09-18 Thread Ryusuke Konishi
On Wed, 17 Sep 2014 21:34:39 +0900 (JST), Ryusuke Konishi wrote:
 On Wed, 17 Sep 2014 10:16:46 +0200, Andreas Rohner wrote:
 On 2014-09-16 15:57, Ryusuke Konishi wrote:
 On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote:
 I'd appreciate your help on testing the patch for some old kernels.
 (And, please declare a Tested-by tag in the reply mail, if the test
 is ok).

 Sure I have everything set up. Which kernels do I have to test? Was
 commit 136e877 backported? I presume at least stable and some of the
 longterm kernels on https://www.kernel.org/...
 
 The commit 136e877 was merged to v3.10 and backported to stable trees
 of earlier kernels.  But, most of earlier stable trees are no longer
 maintained.  Well maintained trees are the following longterm kernels:
 
 - 3.4.y  (backported commit 136e877)
 - 3.10.y
 - 3.14.y
 
 I think these three kernels are worty to be tested.
 
 I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The
 bug is present in all of them and the patch fixes it. The patch also
 applies cleanly on all kernels. I sent it again yesterday, and added the
 Tested-by: tag.

One thing I have a question.

Is the original issue that commit 136e877 fixed still OK ?  If you
haven't tested it, I apprecicate if you examine the test for the prior
issue.

Thanks,
Ryusuke Konishi
--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] nilfs2: fix data loss with mmap()

2014-09-18 Thread Andreas Rohner
On 2014-09-18 09:24, Ryusuke Konishi wrote:
 On Wed, 17 Sep 2014 21:34:39 +0900 (JST), Ryusuke Konishi wrote:
 On Wed, 17 Sep 2014 10:16:46 +0200, Andreas Rohner wrote:
 On 2014-09-16 15:57, Ryusuke Konishi wrote:
 On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote:
 I'd appreciate your help on testing the patch for some old kernels.
 (And, please declare a Tested-by tag in the reply mail, if the test
 is ok).

 Sure I have everything set up. Which kernels do I have to test? Was
 commit 136e877 backported? I presume at least stable and some of the
 longterm kernels on https://www.kernel.org/...

 The commit 136e877 was merged to v3.10 and backported to stable trees
 of earlier kernels.  But, most of earlier stable trees are no longer
 maintained.  Well maintained trees are the following longterm kernels:

 - 3.4.y  (backported commit 136e877)
 - 3.10.y
 - 3.14.y

 I think these three kernels are worty to be tested.

 I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The
 bug is present in all of them and the patch fixes it. The patch also
 applies cleanly on all kernels. I sent it again yesterday, and added the
 Tested-by: tag.
 
 One thing I have a question.
 
 Is the original issue that commit 136e877 fixed still OK ?  If you
 haven't tested it, I apprecicate if you examine the test for the prior
 issue.

Yes the original issue is still fixed. I was able to reproduce it by
reverting  nilfs_set_page_dirty() to the state prior to commit 136e877.
Then I used the new version of nilfs_set_page_dirty() including my patch
and I couldn't reproduce the issue anymore. So it seems to still be fixed.

Furthermore the original issue was caused by the use of
__set_page_dirty_buffers(), which marked unmapped buffers as dirty. My
patch does not change the fix for that.

br,
Andreas Rohner
--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] nilfs2: fix data loss with mmap()

2014-09-18 Thread Ryusuke Konishi
From: Andreas Rohner andreas.roh...@gmx.net

This bug leads to reproducible silent data loss, despite the use of
msync(), sync() and a clean unmount of the file system. It is easily
reproducible with the following script:

[BEGIN SCRIPT]
mkfs.nilfs2 -f /dev/sdb
mount /dev/sdb /mnt

dd if=/dev/zero bs=1M count=30 of=/mnt/testfile

umount /mnt
mount /dev/sdb /mnt
CHECKSUM_BEFORE=$(md5sum /mnt/testfile)

/root/mmaptest/mmaptest /mnt/testfile 30 10 5

sync
CHECKSUM_AFTER=$(md5sum /mnt/testfile)
umount /mnt
mount /dev/sdb /mnt
CHECKSUM_AFTER_REMOUNT=$(md5sum /mnt/testfile)
umount /mnt

echo BEFORE MMAP:\t$CHECKSUM_BEFORE
echo AFTER MMAP:\t$CHECKSUM_AFTER
echo AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT
[END SCRIPT]

The mmaptest tool looks something like this (very simplified, with
error checking removed):

[BEGIN mmaptest]
data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, file_offset);

for (i = 0; i  write_count; ++i) {
memcpy(data + i * 4096, buf, sizeof(buf));
msync(data, file_size - file_offset, MS_SYNC))
}
[END mmaptest]

The output of the script looks something like this:

BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
AFTER MMAP: 6604a1c31f10780331a6850371b3a313  /mnt/testfile
AFTER REMOUNT:  281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile

So it is clear, that the changes done using mmap() do not survive a
remount. This can be reproduced a 100% of the time. The problem was
introduced with the following commit:

136e877 nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF
boundary

If the page was read with mpage_readpage() or mpage_readpages() for
example, then  it has no buffers attached to it. In that case
page_has_buffers(page) in nilfs_set_page_dirty() will be false.
Therefore nilfs_set_file_dirty() is never called and the pages are never
collected and never written to disk.

This patch fixes the problem by also calling nilfs_set_file_dirty() if
the page has no buffers attached to it.

Signed-off-by: Andreas Rohner andreas.roh...@gmx.net
Tested-by: Andreas Rohner andreas.roh...@gmx.net
Signed-off-by: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp
Cc: sta...@vger.kernel.org
---
 fs/nilfs2/inode.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 6252b17..9e3c525 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct 
writeback_control *wbc)
 
 static int nilfs_set_page_dirty(struct page *page)
 {
+   struct inode *inode = page-mapping-host;
int ret = __set_page_dirty_nobuffers(page);
 
if (page_has_buffers(page)) {
-   struct inode *inode = page-mapping-host;
unsigned nr_dirty = 0;
struct buffer_head *bh, *head;
 
@@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page)
 
if (nr_dirty)
nilfs_set_file_dirty(inode, nr_dirty);
+   } else if (ret) {
+   unsigned nr_dirty = 1  (PAGE_SHIFT - inode-i_blkbits);
+
+   nilfs_set_file_dirty(inode, nr_dirty);
}
return ret;
 }
-- 
1.7.9.4

--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] nilfs2: fix data loss with mmap()

2014-09-17 Thread Andreas Rohner
On 2014-09-16 15:57, Ryusuke Konishi wrote:
 On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote:
 I'd appreciate your help on testing the patch for some old kernels.
 (And, please declare a Tested-by tag in the reply mail, if the test
 is ok).

 Sure I have everything set up. Which kernels do I have to test? Was
 commit 136e877 backported? I presume at least stable and some of the
 longterm kernels on https://www.kernel.org/...
 
 The commit 136e877 was merged to v3.10 and backported to stable trees
 of earlier kernels.  But, most of earlier stable trees are no longer
 maintained.  Well maintained trees are the following longterm kernels:
 
 - 3.4.y  (backported commit 136e877)
 - 3.10.y
 - 3.14.y
 
 I think these three kernels are worty to be tested.

I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The
bug is present in all of them and the patch fixes it. The patch also
applies cleanly on all kernels. I sent it again yesterday, and added the
Tested-by: tag.

br,
Andreas Rohner
--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] nilfs2: fix data loss with mmap()

2014-09-16 Thread Andreas Rohner
On 2014-09-16 06:42, Ryusuke Konishi wrote:
 On Tue, 16 Sep 2014 00:24:05 +0200, Andreas Rohner wrote:
 On 2014-09-16 00:01, Ryusuke Konishi wrote:
 Hi Andreas,
 On Mon, 15 Sep 2014 21:47:30 +0200, Andreas Rohner wrote:
 This bug leads to reproducible silent data loss, despite the use of
 msync(), sync() and a clean unmount of the file system. It is easily
 reproducible with the following script:
 snip
 Thank you for reporting this issue.

 I just stumbled upon the weird behaviour of mmap() while testing the
 nilfs_sync_fs() patch.

 I'd like to look into this patch, it looks to point out an important
 regression, but it may take some time since I am quite busy this week..

 Of course. I understand.
 
 The patch looks correct.  It is my mistake that the commit 136e877
 leaked consideration for the case where the page doesn't have buffer
 heads.  This fix should be backported to stable kernels. (I'll add a
 Cc: stable tag when sending this to Andrew.)
 
 Did you confirm that the patch works as expected ?

Yes at least with the current master kernel:

BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
AFTER MMAP: 3d9183f1c471b9baff15c9cc8d12c303  /mnt/testfile
AFTER REMOUNT:  3d9183f1c471b9baff15c9cc8d12c303  /mnt/testfile

For the record here is the little tool I used for testing mmap:

https://github.com/zeitgeist87/mmaptest

It writes pages with random bytes at a certain offset using mmap. The
frequent umounts and mounts in the test script are necessary to purge
the page cache. There is probably a better way of doing that.

 I'd appreciate your help on testing the patch for some old kernels.
 (And, please declare a Tested-by tag in the reply mail, if the test
 is ok).

Sure I have everything set up. Which kernels do I have to test? Was
commit 136e877 backported? I presume at least stable and some of the
longterm kernels on https://www.kernel.org/...

By the way thanks for your continued effort and time investment in
reviewing my patches.

br,
Andreas Rohner

--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] nilfs2: fix data loss with mmap()

2014-09-16 Thread Andreas Rohner
This bug leads to reproducible silent data loss, despite the use of
msync(), sync() and a clean unmount of the file system. It is easily
reproducible with the following script:

[BEGIN SCRIPT]
mkfs.nilfs2 -f /dev/sdb
mount /dev/sdb /mnt

# create 30MB testfile
dd if=/dev/zero bs=1M count=30 of=/mnt/testfile

umount /mnt
mount /dev/sdb /mnt
CHECKSUM_BEFORE=$(md5sum /mnt/testfile)

# simple tool that opens /mnt/testfile and
# writes a few blocks using mmap at a 5MB offset
/root/mmaptest/mmaptest /mnt/testfile 30 10 5

sync
CHECKSUM_AFTER=$(md5sum /mnt/testfile)
umount /mnt
mount /dev/sdb /mnt
CHECKSUM_AFTER_REMOUNT=$(md5sum /mnt/testfile)
umount /mnt

echo BEFORE MMAP:\t$CHECKSUM_BEFORE
echo AFTER MMAP:\t$CHECKSUM_AFTER
echo AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT
[END SCRIPT]

The mmaptest tool looks something like this (very simplified, with
error checking removed):

[BEGIN mmaptest]
data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, file_offset);

for (i = 0; i  write_count; ++i) {
memcpy(data + i * 4096, buf, sizeof(buf));
msync(data, file_size - file_offset, MS_SYNC))
}
[END mmaptest]

The output of the script looks something like this:

BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
AFTER MMAP: 6604a1c31f10780331a6850371b3a313  /mnt/testfile
AFTER REMOUNT:  281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile

So it is clear, that the changes done using mmap() do not survive a
remount. This can be reproduced a 100% of the time. The problem was
introduced with the following commit:

136e877 nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF
boundary

If the page was read with mpage_readpage() or mpage_readpages() for
example, then  it has no buffers attached to it. In that case
page_has_buffers(page) in nilfs_set_page_dirty() will be false.
Therefore nilfs_set_file_dirty() is never called and the pages are never
collected and never written to disk.

This patch fixes the problem by also calling nilfs_set_file_dirty() if
the page has no buffers attached to it.

Signed-off-by: Andreas Rohner andreas.roh...@gmx.net
Tested-by: Andreas Rohner andreas.roh...@gmx.net
---
 fs/nilfs2/inode.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 6252b17..9e3c525 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct 
writeback_control *wbc)
 
 static int nilfs_set_page_dirty(struct page *page)
 {
+   struct inode *inode = page-mapping-host;
int ret = __set_page_dirty_nobuffers(page);
 
if (page_has_buffers(page)) {
-   struct inode *inode = page-mapping-host;
unsigned nr_dirty = 0;
struct buffer_head *bh, *head;
 
@@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page)
 
if (nr_dirty)
nilfs_set_file_dirty(inode, nr_dirty);
+   } else if (ret) {
+   unsigned nr_dirty = 1  (PAGE_SHIFT - inode-i_blkbits);
+
+   nilfs_set_file_dirty(inode, nr_dirty);
}
return ret;
 }
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] nilfs2: fix data loss with mmap()

2014-09-15 Thread Andreas Rohner
On 2014-09-16 00:01, Ryusuke Konishi wrote:
 Hi Andreas,
 On Mon, 15 Sep 2014 21:47:30 +0200, Andreas Rohner wrote:
 This bug leads to reproducible silent data loss, despite the use of
 msync(), sync() and a clean unmount of the file system. It is easily
 reproducible with the following script:
snip
 Thank you for reporting this issue.

I just stumbled upon the weird behaviour of mmap() while testing the
nilfs_sync_fs() patch.

 I'd like to look into this patch, it looks to point out an important
 regression, but it may take some time since I am quite busy this week..

Of course. I understand.

br,
Andreas Rohner

--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] nilfs2: fix data loss in mmap page write for hole blocks

2011-04-01 Thread Ryusuke Konishi
Please consider including the attached patch to stable trees between
2.6.30 and 2.6.37.

This is the modified version of the following commit which is already
in upstream.

 commit: 34094537943113467faee98fe67c8a3d3f9a0a8b
 From: Ryusuke Konishi konishi.ryusuke at lab.ntt.co.jp
 Date: Sun, 27 Mar 2011 22:50:49 +0900
 Subject: [PATCH] nilfs2: fix data loss in mmap page write for hole blocks

This commit is applicable to 2.6.38, but is not to prior kernels as
is.  It will cause a build error against older kernels due to a change
of nilfs_set_file_dirty function.

I tested the attached patch for every kernel through from 2.6.30 to
2.6.37.  It was all right except that 3-way merge was required for
2.6.30 and 2.6.31.


Thanks,
Ryusuke Konishi

--
From: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp

nilfs2: fix data loss in mmap page write for hole blocks

From the result of a function test of mmap, mmap write to shared pages
turned out to be broken for hole blocks.  It doesn't write out filled
blocks and the data will be lost after umount.  This is due to a bug
that the target file is not queued for log writer when filling hole
blocks.

Also, nilfs_page_mkwrite function exits normal code path even after
successfully filled hole blocks due to a change of block_page_mkwrite
function; just after nilfs was merged into the mainline,
block_page_mkwrite() started to return VM_FAULT_LOCKED instead of zero
by the patch mm: close page_mkwrite races (commit:
b827e496c893de0c).  The current nilfs_page_mkwrite() is not handling
this value properly.

This corrects nilfs_page_mkwrite() and will resolve the data loss
problem in mmap write.

[A minor adjustment was applied for 2.6.37 and prior kernels]

Signed-off-by: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp
Tested-by: Ryusuke Konishi konishi.ryus...@lab.ntt.co.jp
Cc: stable sta...@kernel.org  [2.6.30-2.6.37]
---
 fs/nilfs2/file.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index c9a30d7..d9d5d81 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -72,10 +72,9 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, 
struct vm_fault *vmf)
/*
 * check to see if the page is mapped already (no holes)
 */
-   if (PageMappedToDisk(page)) {
-   unlock_page(page);
+   if (PageMappedToDisk(page))
goto mapped;
-   }
+
if (page_has_buffers(page)) {
struct buffer_head *bh, *head;
int fully_mapped = 1;
@@ -90,7 +89,6 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, 
struct vm_fault *vmf)
 
if (fully_mapped) {
SetPageMappedToDisk(page);
-   unlock_page(page);
goto mapped;
}
}
@@ -105,16 +103,18 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, 
struct vm_fault *vmf)
return VM_FAULT_SIGBUS;
 
ret = block_page_mkwrite(vma, vmf, nilfs_get_block);
-   if (unlikely(ret)) {
+   if (ret != VM_FAULT_LOCKED) {
nilfs_transaction_abort(inode-i_sb);
return ret;
}
+   nilfs_set_file_dirty(NILFS_SB(inode-i_sb), inode,
+1  (PAGE_SHIFT - inode-i_blkbits));
nilfs_transaction_commit(inode-i_sb);
 
  mapped:
SetPageChecked(page);
wait_on_page_writeback(page);
-   return 0;
+   return VM_FAULT_LOCKED;
 }
 
 static const struct vm_operations_struct nilfs_file_vm_ops = {
-- 
1.7.3.5

--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] nilfs2: fix data loss in mmap page write for hole blocks

2011-03-28 Thread Ryusuke Konishi
On Mon, 28 Mar 2011 10:15:50 +0200, dexen deVries wrote:
 Hi,
 
 On Sunday 27 of March 2011 17:47:53 Ryusuke wrote:
  From the result of a function test of mmap, mmap write to shared pages
  turned out to be broken for hole blocks.  It doesn't write out filled
  blocks and the data will be lost after umount.  This is due to a bug
  that the target file is not queued for log writer when filling hole
  blocks. (...)
 
 recently I've experienced a few times data loss  hang of several 
 applications 
 with ktorrent running. It seems the transfered files were mostly empty after 
 reboot. Related to this regression  fix?

I don't know your applications actually do what.  If you find mmap
calls for file IOs in strace analysis, this issue seems the likely
cause.

Thanks,
Ryusuke Konishi
--
To unsubscribe from this list: send the line unsubscribe linux-nilfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html