Hi,
On Mon, 22 Mar 2010 15:34:46 +0900 (JST), Ryusuke Konishi wrote:
> On Mon, 22 Mar 2010 15:04:20 +0900 (JST), Ryusuke Konishi wrote:
> > Hi,
> > On Sat, 20 Mar 2010 23:04:39 +0100, Andreas Beckmann wrote:
> > > Hi,
> > > 
> > > I just tried to benchmark nilfs and then the file system and benchmark
> > > process got stuck. dmesg output is attached. The problems start with
> > > 
> > > nilfs_sufile_do_cancel_free: segment 0 must be clean
> > > nilfs_sufile_do_cancel_free: segment 1 must be clean
> > > NILFS warning (device sdb1): nilfs_clean_segments: segment construction failed. (err=-28)
> > > 
> > > I'm using
> > > 
> > > Kernel 2.6.33 (Debian 2.6.33-1~experimental.2)
> > > nilfs-tools 2.0.16 (Debian 2.0.16-1)
> > > 
> > > The processes are unkillable and the file system cannot be unmounted.
> > > The machine will be reset when I get back in physical range on Wednesday
> > > and the stuck file system will be removed. If there is anything I can do
> > > remotely to help you debug that problem before the file system is gone,
> > > let me know.
>
> > Thank you for the detailed report!
> > 
> > I could reproduce both problems (i.e. the warnings from
> > "nilfs_sufile_do_cancel_free" and the hang of the cleaner process) with a
> > manual fault injection test.
> > 
> > Will look into these issues.
> > 
> > Ryusuke Konishi

I've found the cause of the hang-up problem.  The following patch would
fix it.

However, please note that the current nilfs cleaner is designed to
preserve every change made within the ``protection period''.  If you
write a massive amount of data in a short time, nilfs will still stop
with a disk-full error and reject new changes until the cleaner frees
some space.
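
For applications that may hit this condition, one possible approach is
to retry a failed write a bounded number of times, so that the cleaner
has a chance to reclaim space.  The following is only a minimal
user-space sketch under that assumption: write_retry_enospc() is a
hypothetical helper, not part of nilfs or nilfs-tools, and the retry
count and delay are arbitrary parameters chosen by the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nilfs or nilfs-tools): write the whole
 * buffer.  When write() fails with ENOSPC, sleep and retry up to
 * max_retries times in total, so that a background cleaner may free
 * space in the meantime.  Returns 0 on success, or -1 with errno set by
 * the failing write().
 */
static int write_retry_enospc(int fd, const void *buf, size_t len,
                              int max_retries, unsigned int delay_sec)
{
        const char *p = buf;
        int retries = 0;

        while (len > 0) {
                ssize_t n = write(fd, p, len);

                if (n >= 0) {
                        /* Partial writes are possible; advance and go on. */
                        p += n;
                        len -= (size_t)n;
                        continue;
                }
                if (errno == EINTR)
                        continue;       /* interrupted by a signal, retry */
                if (errno == ENOSPC && retries < max_retries) {
                        retries++;
                        sleep(delay_sec);
                        continue;       /* disk full, give the cleaner time */
                }
                return -1;              /* other error, or retries exhausted */
        }
        return 0;
}
```

Whether retrying is appropriate depends on the workload.  If the data
written within the protection period alone exceeds the device capacity,
waiting will not help, and the protection period would have to be
shortened instead.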

Thanks,
Ryusuke Konishi
--
From: Ryusuke Konishi <[email protected]>
Subject: [PATCH] nilfs2: fix hang-up of cleaner after log writer returned with error

According to the report from Andreas Beckmann (Message-ID:
<[email protected]>), nilfs in the 2.6.33 kernel got stuck
after a disk full error.

This turned out to be a regression introduced by the log writer updates
merged in kernel 2.6.33.  nilfs_segctor_abort_construction, which is a
cleanup function for error cases, was skipping writeback completion for
some logs.

This fixes the bug and should resolve the hang issue.

Reported-by: Andreas Beckmann <[email protected]>
Signed-off-by: Ryusuke Konishi <[email protected]>
---
 fs/nilfs2/segment.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index b622123..c161d89 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1897,8 +1897,7 @@ static void nilfs_segctor_abort_construction(struct nilfs_sc_info *sci,
 
        list_splice_tail_init(&sci->sc_write_logs, &logs);
        ret = nilfs_wait_on_logs(&logs);
-       if (ret)
-               nilfs_abort_logs(&logs, NULL, sci->sc_super_root, ret);
+       nilfs_abort_logs(&logs, NULL, sci->sc_super_root, ret ? : err);
 
        list_splice_tail_init(&sci->sc_segbufs, &logs);
        nilfs_cancel_segusage(&logs, nilfs->ns_sufile);
--
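
To make the one-line change above easier to follow, here is a minimal
user-space sketch of the control flow before and after the patch.  It is
only a toy model: struct toy_log and the toy_* functions are
hypothetical stand-ins, not the kernel's data structures or functions.
The expression "ret ? : err" in the patch is the GNU C conditional with
an omitted middle operand; the sketch spells it out as
"wait_ret ? wait_ret : err".

```c
#include <stdbool.h>

/* Toy stand-in for one log; not the kernel's data structure. */
struct toy_log {
        bool writeback_pending; /* writeback has not been completed yet */
        int error;              /* error recorded when the log was aborted */
};

/* Toy counterpart of nilfs_abort_logs(): complete writeback with an error. */
static void toy_abort_log(struct toy_log *log, int err)
{
        log->writeback_pending = false;
        log->error = err;
}

/*
 * Before the patch: the log is aborted only when waiting on it failed.
 * If the wait succeeded (wait_ret == 0) but construction failed for
 * another reason (err), writeback completion is skipped.
 */
static void toy_abort_construction_old(struct toy_log *log,
                                       int wait_ret, int err)
{
        (void)err;      /* the caller's error was ignored on this path */
        if (wait_ret)
                toy_abort_log(log, wait_ret);
}

/*
 * After the patch: the log is always aborted.  The wait error is used
 * when there is one, otherwise the caller's error.
 */
static void toy_abort_construction_new(struct toy_log *log,
                                       int wait_ret, int err)
{
        toy_abort_log(log, wait_ret ? wait_ret : err);
}
```

With wait_ret == 0 and err == -28 (the error from the report), the old
variant leaves writeback_pending set, which models the stuck state; the
new variant clears it and records -28.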