Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Liu Bo
On 08/02/2012 06:30 PM, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
 On 08/01/2012 09:07 PM, Jan Schmidt wrote:
 On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
 On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.

 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).

 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.


 Yes, I have to admit that this can be a serious problem.

 But we'll need to send the error flag stored in the super block into
 disk in the future so that the next mount can find it unstable and do
 fsck by itself maybe.

 Hum, that's possible. However, I neither see

 a) a safe way to get that flag to disk

 nor

 b) a situation where this flag would help. When we abort a transaction, we 
 just
 roll everything back to the last commit, i.e. a consistent state. So if we 
 stop
 writing a potentially corrupt super block, we should be fine anyway. Or am 
 I
 missing something?


 I'm just wondering if we can roll everything back well, why do we need fsck?

 If the disks support barriers, we roll everything back very well. The
 most recent superblock on the disks always defines a consistent
 filesystem state. There are only two remaining filesystem consistency
 issues left that can cause inconsistent states, one is the one that the
 patch in this email addresses, and the second one is that the error
 result from barrier_all_devices() is ignored (which I want to change next).
 
 Hi Liu Bo,
 
 Do you have any remaining objections to that patch?
 

Hi Stefan,

Still I have another question:

Our metadata can be flushed into disk if we reach the limit, 32k, so we
can end up with updated metadata and the latest superblock if we do not
write the current super block.

Any ideas?

thanks,
liubo
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Arne Jansen
On 02.08.2012 12:36, Liu Bo wrote:
 On 08/02/2012 06:30 PM, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
 On 08/01/2012 09:07 PM, Jan Schmidt wrote:
 On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
 On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.

 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).

 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.


 Yes, I have to admit that this can be a serious problem.

 But we'll need to send the error flag stored in the super block into
 disk in the future so that the next mount can find it unstable and do
 fsck by itself maybe.

 Hum, that's possible. However, I neither see

 a) a safe way to get that flag to disk

 nor

 b) a situation where this flag would help. When we abort a transaction, 
 we just
 roll everything back to the last commit, i.e. a consistent state. So if 
 we stop
 writing a potentially corrupt super block, we should be fine anyway. Or 
 am I
 missing something?


 I'm just wondering if we can roll everything back well, why do we need 
 fsck?

 If the disks support barriers, we roll everything back very well. The
 most recent superblock on the disks always defines a consistent
 filesystem state. There are only two remaining filesystem consistency
 issues left that can cause inconsistent states, one is the one that the
 patch in this email addresses, and the second one is that the error
 result from barrier_all_devices() is ignored (which I want to change next).

 Hi Liu Bo,

 Do you have any remaining objections to that patch?

 
 Hi Stefan,
 
 Still I have another question:
 
 Our metadata can be flushed into disk if we reach the limit, 32k, so we
 can end up with updated metadata and the latest superblock if we do not
 write the current super block.

The old metadata stays valid until the new superblock is written,
so no problem here, or maybe I don't understand your question :)

 
 Any ideas?
 
 thanks,
 liubo
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Liu Bo
On 08/02/2012 07:18 PM, Arne Jansen wrote:
 On 02.08.2012 12:36, Liu Bo wrote:
 On 08/02/2012 06:30 PM, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
 On 08/01/2012 09:07 PM, Jan Schmidt wrote:
 On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
 On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.

 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).

 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.


 Yes, I have to admit that this can be a serious problem.

 But we'll need to send the error flag stored in the super block into
 disk in the future so that the next mount can find it unstable and do
 fsck by itself maybe.

 Hum, that's possible. However, I neither see

 a) a safe way to get that flag to disk

 nor

 b) a situation where this flag would help. When we abort a transaction, 
 we just
 roll everything back to the last commit, i.e. a consistent state. So if 
 we stop
 writing a potentially corrupt super block, we should be fine anyway. Or 
 am I
 missing something?


 I'm just wondering if we can roll everything back well, why do we need 
 fsck?

 If the disks support barriers, we roll everything back very well. The
 most recent superblock on the disks always defines a consistent
 filesystem state. There are only two remaining filesystem consistency
 issues left that can cause inconsistent states, one is the one that the
 patch in this email addresses, and the second one is that the error
 result from barrier_all_devices() is ignored (which I want to change next).

 Hi Liu Bo,

 Do you have any remaining objections to that patch?


 Hi Stefan,

 Still I have another question:

 Our metadata can be flushed into disk if we reach the limit, 32k, so we
 can end up with updated metadata and the latest superblock if we do not
 write the current super block.
 
 The old metadata stays valid until the new superblock is written,
 so no problem here, or maybe I don't understand your question :)
 

Yeah, Arne, you're right :)

But for undetected and unexpected errors as Arne had mentioned,  I want
to keep the error flag which is able to inform users that this FS is
recommended (but not must) to do fsck at least.

thanks,
liubo


 Any ideas?

 thanks,
 liubo
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Arne Jansen
On 02.08.2012 13:34, Liu Bo wrote:
 On 08/02/2012 07:18 PM, Arne Jansen wrote:
 On 02.08.2012 12:36, Liu Bo wrote:
 On 08/02/2012 06:30 PM, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
 On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
 On 08/01/2012 09:07 PM, Jan Schmidt wrote:
 On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
 On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.

 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).

 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.


 Yes, I have to admit that this can be a serious problem.

 But we'll need to send the error flag stored in the super block into
 disk in the future so that the next mount can find it unstable and do
 fsck by itself maybe.

 Hum, that's possible. However, I neither see

 a) a safe way to get that flag to disk

 nor

 b) a situation where this flag would help. When we abort a transaction, 
 we just
 roll everything back to the last commit, i.e. a consistent state. So if 
 we stop
 writing a potentially corrupt super block, we should be fine anyway. Or 
 am I
 missing something?


 I'm just wondering if we can roll everything back well, why do we need 
 fsck?

 If the disks support barriers, we roll everything back very well. The
 most recent superblock on the disks always defines a consistent
 filesystem state. There are only two remaining filesystem consistency
 issues left that can cause inconsistent states, one is the one that the
 patch in this email addresses, and the second one is that the error
 result from barrier_all_devices() is ignored (which I want to change 
 next).

 Hi Liu Bo,

 Do you have any remaining objections to that patch?


 Hi Stefan,

 Still I have another question:

 Our metadata can be flushed into disk if we reach the limit, 32k, so we
 can end up with updated metadata and the latest superblock if we do not
 write the current super block.

 The old metadata stays valid until the new superblock is written,
 so no problem here, or maybe I don't understand your question :)

 
 Yeah, Arne, you're right :)
 
 But for undetected and unexpected errors as Arne had mentioned,  I want
 to keep the error flag which is able to inform users that this FS is
 recommended (but not must) to do fsck at least.

How about storing the flag in a different location than the superblock?
If the fs is in an unknown state, every write potentially makes it only
worse.

 
 thanks,
 liubo
 

 Any ideas?

 thanks,
 liubo
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

 
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread David Sterba
On Thu, Aug 02, 2012 at 03:46:50PM +0200, Arne Jansen wrote:
  Anyway, for now, our error flag has only been stored in memory, so what
  about just keep it until we find a graceful way?
 
 Yeah, we need this patch to restore consistency. We can define a fixed
 area on disk (e.g. behind the superblock) where we can write the flag
 to without risking the superblock.

Is it possible that sectors around the superblock are somehow damaged
and unwritable that writing to them will fail as well? Ie. some safe
distance from the superblock would be needed.

david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Arne Jansen
On 02.08.2012 15:57, David Sterba wrote:
 On Thu, Aug 02, 2012 at 03:46:50PM +0200, Arne Jansen wrote:
 Anyway, for now, our error flag has only been stored in memory, so what
 about just keep it until we find a graceful way?

 Yeah, we need this patch to restore consistency. We can define a fixed
 area on disk (e.g. behind the superblock) where we can write the flag
 to without risking the superblock.
 
 Is it possible that sectors around the superblock are somehow damaged
 and unwritable that writing to them will fail as well? Ie. some safe
 distance from the superblock would be needed.

In about 1MB distance you're side-by-side with the superblock, so closer
to it might even be safer ;)

 
 david

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Jan Schmidt
On Thu, August 02, 2012 at 15:46 (+0200), Arne Jansen wrote:
 On 02.08.2012 13:57, Liu Bo wrote:
 Anyway, for now, our error flag has only been stored in memory, so what
 about just keep it until we find a graceful way?
 
 Yeah, we need this patch to restore consistency. We can define a fixed
 area on disk (e.g. behind the superblock) where we can write the flag
 to without risking the superblock.

At least we all agree that we need this patch, fine.

We don't yet agree that we need a place to store a please consider fsck flag.
Can I please get one concrete example in which situation we

a) do detect the user should really do a file system check AND
b) do not abort the transaction to clean the mess up?

(An example on how we could fail transaction cleanup is also accepted).

If such a situation doesn't exist, there's no need for this flag. The fact that
ext has such a flag doesn't convince me, probably because I know nothing about
ext. I can imagine that they can detect file system errors without the ability
to return to a potentially older consistent state.

Thanks,
-Jan
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-02 Thread Jan Schmidt
[added Josef]

Josef: Please queue this patch for rc2, we finally agreed.

On Fri, August 03, 2012 at 05:20 (+0200), Liu Bo wrote:
 On 08/02/2012 10:28 PM, Jan Schmidt wrote:
 If such a situation doesn't exist, there's no need for this flag. The fact 
 that
 ext has such a flag doesn't convince me, probably because I know nothing 
 about
 ext. I can imagine that they can detect file system errors without the 
 ability
 to return to a potentially older consistent state.

 
 This error flag is also used to indicate filesystem's error state for
 transaction cleanup, so keeping it in memory is reasonable.

Granted :-)

-Jan
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-01 Thread Stefan Behrens
With commit acce952b0, btrfs was changed to flag the filesystem with
BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
error happened like a write I/O errors of all mirrors.
In such situations, on unmount, the superblock is written in
btrfs_error_commit_super(). This is done with the intention to be able
to evaluate the error flag on the next mount. A warning is printed
in this case during the next mount and the log tree is ignored.

The issue is that it is possible that the superblock points to a root
that was not written (due to write I/O errors).
The result is that the filesystem cannot be mounted. btrfsck also does
not start and all the other btrfs-progs tools fail to start as well.
However, mount -o recovery is working well and does the right things
to recover the filesystem (i.e., don't use the log root, clear the
free space cache and use the next mountable root that is stored in the
root backup array).

This patch removes the writing of the superblock when
BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
flag in the mount function.

These lines can be used to reproduce the issue (using /dev/sdm):
SCRATCH_DEV=/dev/sdm
SCRATCH_MNT=/mnt
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
ls -alLF /dev/mapper/foo
mkfs.btrfs /dev/mapper/foo
mount /dev/mapper/foo $SCRATCH_MNT
echo bar  $SCRATCH_MNT/foo
sync
echo 0 25165824 error | dmsetup reload foo
dmsetup resume foo
ls -alF $SCRATCH_MNT
touch $SCRATCH_MNT/1
ls -alF $SCRATCH_MNT
sleep 35
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
dmsetup resume foo
sleep 1
umount $SCRATCH_MNT
btrfsck /dev/mapper/foo
dmsetup remove foo

Signed-off-by: Stefan Behrens sbehr...@giantdisaster.de
Signed-off-by: Jan Schmidt list.bt...@jan-o-sch.net
---
Changes v1 - v2:
- Removed one large comment entirely which was not needed anymore
  after the code changes.
- Removed a warning message which was dead code since the super block
  is not written anymore when the flag BTRFS_SUPER_FLAG_ERROR is set.
- Removed the return code of btrfs_error_commit_super() since it now
  does not return any errors anymore.

 fs/btrfs/disk-io.c | 36 
 fs/btrfs/disk-io.h |  2 +-
 2 files changed, 5 insertions(+), 33 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 502b20c..f6a1d0f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2531,8 +2531,7 @@ retry_root_backup:
goto fail_trans_kthread;
 
/* do not make disk changes in broken FS */
-   if (btrfs_super_log_root(disk_super) != 0 
-   !(fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR)) {
+   if (btrfs_super_log_root(disk_super) != 0) {
u64 bytenr = btrfs_super_log_root(disk_super);
 
if (fs_devices-rw_devices == 0) {
@@ -3192,30 +3191,14 @@ int close_ctree(struct btrfs_root *root)
/* clear out the rbtree of defraggable inodes */
btrfs_run_defrag_inodes(fs_info);
 
-   /*
-* Here come 2 situations when btrfs is broken to flip readonly:
-*
-* 1. when btrfs flips readonly somewhere else before
-* btrfs_commit_super, sb-s_flags has MS_RDONLY flag,
-* and btrfs will skip to write sb directly to keep
-* ERROR state on disk.
-*
-* 2. when btrfs flips readonly just in btrfs_commit_super,
-* and in such case, btrfs cannot write sb via btrfs_commit_super,
-* and since fs_state has been set BTRFS_SUPER_FLAG_ERROR flag,
-* btrfs will cleanup all FS resources first and write sb then.
-*/
if (!(fs_info-sb-s_flags  MS_RDONLY)) {
ret = btrfs_commit_super(root);
if (ret)
printk(KERN_ERR btrfs: commit super ret %d\n, ret);
}
 
-   if (fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR) {
-   ret = btrfs_error_commit_super(root);
-   if (ret)
-   printk(KERN_ERR btrfs: commit super ret %d\n, ret);
-   }
+   if (fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR)
+   btrfs_error_commit_super(root);
 
btrfs_put_block_group_cache(fs_info);
 
@@ -3437,18 +3420,11 @@ static int btrfs_check_super_valid(struct btrfs_fs_info 
*fs_info,
if (read_only)
return 0;
 
-   if (fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR) {
-   printk(KERN_WARNING warning: mount fs with errors, 
-  running btrfsck is recommended\n);
-   }
-
return 0;
 }
 
-int btrfs_error_commit_super(struct btrfs_root *root)
+void btrfs_error_commit_super(struct btrfs_root *root)
 {
-   int ret;
-
mutex_lock(root-fs_info-cleaner_mutex);
btrfs_run_delayed_iputs(root);
mutex_unlock(root-fs_info-cleaner_mutex);
@@ -3458,10 +3434,6 @@ int btrfs_error_commit_super(struct btrfs_root *root)
 
/* cleanup FS via transaction */
btrfs_cleanup_transaction(root);
-
-   ret = 

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-01 Thread Liu Bo
On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.
 
 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).
 
 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.
 

Yes, I have to admit that this can be a serious problem.

But we'll need to send the error flag stored in the super block into
disk in the future so that the next mount can find it unstable and do
fsck by itself maybe.

If we do not write the super any more, it cannot find the flag.

So can we find a way to make both things happy?

thanks,
liubo

 These lines can be used to reproduce the issue (using /dev/sdm):
 SCRATCH_DEV=/dev/sdm
 SCRATCH_MNT=/mnt
 echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
 ls -alLF /dev/mapper/foo
 mkfs.btrfs /dev/mapper/foo
 mount /dev/mapper/foo $SCRATCH_MNT
 echo bar  $SCRATCH_MNT/foo
 sync
 echo 0 25165824 error | dmsetup reload foo
 dmsetup resume foo
 ls -alF $SCRATCH_MNT
 touch $SCRATCH_MNT/1
 ls -alF $SCRATCH_MNT
 sleep 35
 echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
 dmsetup resume foo
 sleep 1
 umount $SCRATCH_MNT
 btrfsck /dev/mapper/foo
 dmsetup remove foo
 
 Signed-off-by: Stefan Behrens sbehr...@giantdisaster.de
 Signed-off-by: Jan Schmidt list.bt...@jan-o-sch.net
 ---
 Changes v1 - v2:
 - Removed one large comment entirely which was not needed anymore
   after the code changes.
 - Removed a warning message which was dead code since the super block
   is not written anymore when the flag BTRFS_SUPER_FLAG_ERROR is set.
 - Removed the return code of btrfs_error_commit_super() since it now
   does not return any errors anymore.
 
  fs/btrfs/disk-io.c | 36 
  fs/btrfs/disk-io.h |  2 +-
  2 files changed, 5 insertions(+), 33 deletions(-)
 
 diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
 index 502b20c..f6a1d0f 100644
 --- a/fs/btrfs/disk-io.c
 +++ b/fs/btrfs/disk-io.c
 @@ -2531,8 +2531,7 @@ retry_root_backup:
   goto fail_trans_kthread;
  
   /* do not make disk changes in broken FS */
 - if (btrfs_super_log_root(disk_super) != 0 
 - !(fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR)) {
 + if (btrfs_super_log_root(disk_super) != 0) {
   u64 bytenr = btrfs_super_log_root(disk_super);
  
   if (fs_devices-rw_devices == 0) {
 @@ -3192,30 +3191,14 @@ int close_ctree(struct btrfs_root *root)
   /* clear out the rbtree of defraggable inodes */
   btrfs_run_defrag_inodes(fs_info);
  
 - /*
 -  * Here come 2 situations when btrfs is broken to flip readonly:
 -  *
 -  * 1. when btrfs flips readonly somewhere else before
 -  * btrfs_commit_super, sb-s_flags has MS_RDONLY flag,
 -  * and btrfs will skip to write sb directly to keep
 -  * ERROR state on disk.
 -  *
 -  * 2. when btrfs flips readonly just in btrfs_commit_super,
 -  * and in such case, btrfs cannot write sb via btrfs_commit_super,
 -  * and since fs_state has been set BTRFS_SUPER_FLAG_ERROR flag,
 -  * btrfs will cleanup all FS resources first and write sb then.
 -  */
   if (!(fs_info-sb-s_flags  MS_RDONLY)) {
   ret = btrfs_commit_super(root);
   if (ret)
   printk(KERN_ERR btrfs: commit super ret %d\n, ret);
   }
  
 - if (fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR) {
 - ret = btrfs_error_commit_super(root);
 - if (ret)
 - printk(KERN_ERR btrfs: commit super ret %d\n, ret);
 - }
 + if (fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR)
 + btrfs_error_commit_super(root);
  
   btrfs_put_block_group_cache(fs_info);
  
 @@ -3437,18 +3420,11 @@ static int btrfs_check_super_valid(struct 
 btrfs_fs_info *fs_info,
   if (read_only)
   return 0;
  
 - if (fs_info-fs_state  BTRFS_SUPER_FLAG_ERROR) {
 - printk(KERN_WARNING warning: mount fs with errors, 
 -running btrfsck is recommended\n);
 - }
 -
   return 0;
  }
  
 -int 

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-01 Thread Liu Bo
On 08/01/2012 09:07 PM, Jan Schmidt wrote:
 On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
 On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.

 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).

 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.


 Yes, I have to admit that this can be a serious problem.

 But we'll need to send the error flag stored in the super block into
 disk in the future so that the next mount can find it unstable and do
 fsck by itself maybe.
 
 Hum, that's possible. However, I neither see
 
 a) a safe way to get that flag to disk
 
 nor
 
 b) a situation where this flag would help. When we abort a transaction, we 
 just
 roll everything back to the last commit, i.e. a consistent state. So if we 
 stop
 writing a potentially corrupt super block, we should be fine anyway. Or am I
 missing something?
 

I'm just wondering if we can roll everything back well, why do we need fsck?

thanks,
liubo

 -Jan
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

2012-08-01 Thread Arne Jansen
On 01.08.2012 15:31, Liu Bo wrote:
 On 08/01/2012 09:07 PM, Jan Schmidt wrote:
 On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
 On 08/01/2012 07:45 PM, Stefan Behrens wrote:
 With commit acce952b0, btrfs was changed to flag the filesystem with
 BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
 error happened like a write I/O errors of all mirrors.
 In such situations, on unmount, the superblock is written in
 btrfs_error_commit_super(). This is done with the intention to be able
 to evaluate the error flag on the next mount. A warning is printed
 in this case during the next mount and the log tree is ignored.

 The issue is that it is possible that the superblock points to a root
 that was not written (due to write I/O errors).
 The result is that the filesystem cannot be mounted. btrfsck also does
 not start and all the other btrfs-progs tools fail to start as well.
 However, mount -o recovery is working well and does the right things
 to recover the filesystem (i.e., don't use the log root, clear the
 free space cache and use the next mountable root that is stored in the
 root backup array).

 This patch removes the writing of the superblock when
 BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
 flag in the mount function.


 Yes, I have to admit that this can be a serious problem.

 But we'll need to send the error flag stored in the super block into
 disk in the future so that the next mount can find it unstable and do
 fsck by itself maybe.

 Hum, that's possible. However, I neither see

 a) a safe way to get that flag to disk

 nor

 b) a situation where this flag would help. When we abort a transaction, we 
 just
 roll everything back to the last commit, i.e. a consistent state. So if we 
 stop
 writing a potentially corrupt super block, we should be fine anyway. Or am I
 missing something?

 
 I'm just wondering if we can roll everything back well, why do we need fsck?

Mostly for undetected errors.

 
 thanks,
 liubo
 
 -Jan

 
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html