Re: Re: What is the vision for btrfs fs repair?

2014-11-17 Thread Phillip Susi

On 10/11/2014 3:29 AM, Goffredo Baroncelli wrote:
 On 10/10/2014 12:53 PM, Bob Marley wrote:
 
 If true, maybe the closest indication we'd get of btrfs
 stability is the default enabling of autorecovery.
 
 No way! I wouldn't want a default like that.
 
 If you think about distributed transactions: suppose a sync was
 issued on both sides of a distributed transaction, then power was
 lost on one side, then btrfs had corruption. When I remount it,
 definitely the worst thing that can happen is that it
 auto-rolls-back to a previous known-good state.
 
 I cannot agree. I consider a sane default to be a consistent
 state with the recently written data lost, instead of requiring
 user intervention to avoid losing anything.
 
 To address your requirement, we need a super sync command which
 ensures that the data are in the filesystem and not only in the log
 (as sync should ensure).

I have to agree.  There is a reason we have fsck -p and why that is what
is run at boot time.  Some repairs involve a tradeoff that will result
in permanent data loss which might have been avoided by going the other
way, or by performing manual recovery.  Such repairs should never be done
automatically by default.

For that matter I'm not even sure this sort of thing should be there as
a mount option at all.  It really should require a manual fsck run with
a big warning that *THIS WILL THROW OUT SOME DATA*.

Now if the data is saved to a snapshot or something so you can manually
try to recover it later rather than being thrown out wholesale, I can
see that being done automatically at boot time.  Of course, if btrfs is
that damaged then wouldn't grub be unable to load your kernel in the
first place?



Re: What is the vision for btrfs fs repair?

2014-10-13 Thread Austin S Hemmelgarn

On 2014-10-10 18:05, Eric Sandeen wrote:

On 10/10/14 2:35 PM, Austin S Hemmelgarn wrote:

On 2014-10-10 13:43, Bob Marley wrote:

On 10/10/2014 16:37, Chris Murphy wrote:

The fail safe behavior is to treat the known good tree root as
the default tree root, and bypass the bad tree root if it cannot
be repaired, so that the volume can be mounted with default mount
options (i.e. the ones in fstab). Otherwise it's a filesystem
that isn't well suited for general purpose use as rootfs let
alone for boot.



A filesystem which is suited for general purpose use is a
filesystem which honors fsync, and doesn't *ever* auto-roll-back
without user intervention.

 Anything different is not suited for database transactions at all.
 Any paid service which has the users' database on btrfs is going to
 be at risk of losing payments, and probably without the company
 even knowing. If btrfs goes this way I hope a big warning is
 written on the wiki and on the manpages stating that this
 filesystem is totally unsuitable for hosting databases performing
 transactions.

If they need reliability, they should have some form of redundancy
in-place and/or run the database directly on the block device;
because even ext4, XFS, and pretty much every other filesystem can
lose data sometimes,


Not if, e.g., fsync returns.  If the data is gone later, it's a hardware
problem, or occasionally a bug - bugs that are usually found & fixed
pretty quickly.

Yes, barring bugs and hardware problems they won't lose data.



the difference being that those tend to give
worse results when hardware is misbehaving than BTRFS does, because
BTRFS usually has an old copy of whatever data structure gets
corrupted to fall back on.


I'm curious, is that based on conjecture or real-world testing?

I wouldn't really call it testing, but based on personal experience I 
know that ext4 can lose whole directory sub-trees if it gets a single 
corrupt sector in the wrong place.  I've also had that happen on FAT32 
and (somewhat interestingly) HFS+ with failing/misbehaving hardware; and 
I've actually had individual files disappear on HFS+ without any 
discernible hardware issues.  I don't have as much experience with XFS, 
but would assume based on what I do know of it that it could have 
similar issues.  As for BTRFS, I've only ever had issues with it three 
times: one was due to the kernel panicking during resume from S1, and 
the other two were due to hardware problems that would have caused 
issues on most other filesystems as well.  In both cases of hardware 
issues, while the filesystem was initially unmountable, it was 
relatively simple to fix once I knew how.  I tried to fix an ext4 fs 
that had become unmountable due to dropped writes once, and that was 
anything but simple, even with the much greater amount of documentation.






Re: What is the vision for btrfs fs repair?

2014-10-13 Thread Austin S Hemmelgarn

On 2014-10-12 06:14, Martin Steigerwald wrote:

On Friday, 10 October 2014 at 10:37:44, Chris Murphy wrote:

On Oct 10, 2014, at 6:53 AM, Bob Marley bobmar...@shiftmail.org wrote:

On 10/10/2014 03:58, Chris Murphy wrote:

* mount -o recovery

Enable autorecovery attempts if a bad tree root is found at mount
time.


I'm confused why it's not the default yet. Maybe it's continuing to
evolve at a pace that suggests something could sneak in that makes
things worse? It is almost an oxymoron in that I'm manually enabling an
autorecovery

If true, maybe the closest indication we'd get of btrfs stability is the
default enabling of autorecovery.

No way!
I wouldn't want a default like that.

If you think about distributed transactions: suppose a sync was issued on
both sides of a distributed transaction, then power was lost on one side,
then btrfs had corruption. When I remount it, definitely the worst thing
that can happen is that it auto-rolls-back to a previous known-good
state.

For a general purpose file system, the price of not losing 30 seconds (or
less) of questionably committed, likely corrupt data is a file system that
won't mount without user intervention, which requires a secret decoder ring
to get it to mount at all. And may require the use of specialized tools to
retrieve that data in any case.

The fail safe behavior is to treat the known good tree root as the default
tree root, and bypass the bad tree root if it cannot be repaired, so that
the volume can be mounted with default mount options (i.e. the ones in
fstab). Otherwise it's a filesystem that isn't well suited for general
purpose use as rootfs let alone for boot.


To understand this a bit better:

What can be the reasons a recent tree gets corrupted?


Well, so far I have had the following cause corrupted trees:
1. Kernel panic during resume from ACPI S1 (suspend to RAM), which just 
happened to be in the middle of a tree commit.

2. Generic power loss during a tree commit.
3. A device not properly honoring write-barriers (the operations 
immediately adjacent to the write barrier weren't being ordered 
correctly all the time).


Based on what I know about BTRFS, the following could also cause problems:
1. A single-event-upset somewhere in the write path.
2. The kernel issuing a write to the wrong device (I haven't had this 
happen to me, but know people who have).


In general, any of these will cause problems for pretty much any 
filesystem, not just BTRFS.

I always thought with a controller and device and driver combination that
honors fsync with BTRFS it would either be the new state or the last known
good state *anyway*. So where does the need to rollback arise from?

I think that in this case the term rollback is a bit ambiguous; here it 
means from the point of view of userspace, which sees the FS as having 
'rolled-back' from the most recent state to the last known good state.

That said all journalling filesystems have some sort of rollback as far as I
understand: If the last journal entry is incomplete they discard it on journal
replay. So even there you lose the last seconds of write activity.

But in case fsync() returns, the data needs to be safe on disk. I always
thought BTRFS honors this under *any* circumstance. If some proposed
autorollback breaks this guarantee, I think something is broken elsewhere.

And fsync is an fsync is an fsync. Its semantics are clear as crystal. There
is nothing, absolutely nothing to discuss about it.

An fsync completes if the device itself reported "Yeah, I have the data on
disk, all safe and cool to go." Anything else is a bug IMO.

Or a hardware issue; most filesystems need disks to properly honor write 
barriers to provide guaranteed semantics on an fsync, and many consumer 
disk drives still don't honor them consistently.
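
To make the contract being discussed concrete, here is a minimal userspace
sketch of the write-then-fsync pattern (the path and record are invented and
error handling is trimmed to the essentials); even when fsync() returns 0,
durability still depends on the drive honoring the cache flush the filesystem
issues underneath:

/* Minimal sketch: data may only be reported as committed once fsync()
 * has returned 0, and even then only if the device honors the flush/FUA
 * requests issued below it.  Path and payload are examples only. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char buf[] = "payment record 42\n";
    int fd = open("/tmp/example-journal", O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, buf, sizeof(buf) - 1) != (ssize_t)(sizeof(buf) - 1)) {
        perror("write");
        return 1;
    }
    /* Only after this returns 0 may the application consider the record
     * durable.  A disk that lies about its cache breaks this promise no
     * matter which filesystem sits above it. */
    if (fsync(fd) != 0) {
        perror("fsync");
        return 1;
    }
    close(fd);
    return 0;
}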






Re: What is the vision for btrfs fs repair?

2014-10-13 Thread Rich Freeman
On Sun, Oct 12, 2014 at 6:14 AM, Martin Steigerwald mar...@lichtvoll.de wrote:
 On Friday, 10 October 2014 at 10:37:44, Chris Murphy wrote:
 On Oct 10, 2014, at 6:53 AM, Bob Marley bobmar...@shiftmail.org wrote:
  On 10/10/2014 03:58, Chris Murphy wrote:
  * mount -o recovery
 
Enable autorecovery attempts if a bad tree root is found at mount
time.
 
  I'm confused why it's not the default yet. Maybe it's continuing to
  evolve at a pace that suggests something could sneak in that makes
  things worse? It is almost an oxymoron in that I'm manually enabling an
  autorecovery
 
  If true, maybe the closest indication we'd get of btrfs stability is the
  default enabling of autorecovery.
  No way!
  I wouldn't want a default like that.
 
  If you think about distributed transactions: suppose a sync was issued on
  both sides of a distributed transaction, then power was lost on one side,
  then btrfs had corruption. When I remount it, definitely the worst thing
  that can happen is that it auto-rolls-back to a previous known-good
  state.
 For a general purpose file system, the price of not losing 30 seconds (or
 less) of questionably committed, likely corrupt data is a file system that
 won't mount without user intervention, which requires a secret decoder ring
 to get it to mount at all. And may require the use of specialized tools to
 retrieve that data in any case.

 The fail safe behavior is to treat the known good tree root as the default
 tree root, and bypass the bad tree root if it cannot be repaired, so that
 the volume can be mounted with default mount options (i.e. the ones in
 fstab). Otherwise it's a filesystem that isn't well suited for general
 purpose use as rootfs let alone for boot.

 To understand this a bit better:

 What can be the reasons a recent tree gets corrupted?

 I always thought with a controller and device and driver combination that
 honors fsync with BTRFS it would either be the new state or the last known
 good state *anyway*. So where does the need to rollback arise from?


In theory the recovery option should never be necessary.  Btrfs makes
all the guarantees everybody wants it to - when the data is fsynced
then it will never be lost.

The question is what should happen when a corrupted tree root, which
should never happen, happens anyway.  The options are to refuse to
mount the filesystem by default, or mount it by default discarding
about 30-60s worth of writes.  And yes, when this situation happens
(whether it mounts by default or not) btrfs has broken its promise of
data being written after a successful fsync return.

As has been pointed out, braindead drive firmware is the most likely
cause of this sort of issue.  However, there are a number of other
hardware and software errors that could cause it, including errors in
linux outside of btrfs, and of course bugs in btrfs as well.

In an ideal world no filesystem would need any kind of recovery/repair
tools.  They can often mean that the fsync promise was broken.  The
real question is, once that has happened, how do you move on?

I think the best default is to auto-recover, but to have better
facilities for reporting errors to the user.  Right now btrfs is very
quiet about failures - maybe a cryptic message in dmesg, and nobody
reads all of that unless they're looking for something.  If btrfs
could report significant issues that might mitigate the impact of an
auto-recovery.

Also, another thing to consider during recovery is whether the damaged
data could be optionally stored in a snapshot of some kind - maybe in
the way that ext3/4 rollback data after conversion gets stored in a
snapshot.  My knowledge of the underlying structures is weak, but I'd
think that a corrupted tree root practically is a snapshot already,
and turning it into one might even be easier than cleaning it up.  Of
course, we would need to ensure the snapshot could be deleted without
further error.  Doing anything with the snapshot might require special
tools, but if people want to do disk scraping they could.

--
Rich


Re: What is the vision for btrfs fs repair?

2014-10-13 Thread Josef Bacik

On 10/08/2014 03:11 PM, Eric Sandeen wrote:

I was looking at Marc's post:

http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html

and it feels like there isn't exactly a cohesive, overarching vision for
repair of a corrupted btrfs filesystem.

In other words - I'm an admin cruising along, when the kernel throws some
fs corruption error, or for whatever reason btrfs fails to mount.
What should I do?

Marc lays out several steps, but to me this highlights that there seem to
be a lot of disjoint mechanisms out there to deal with these problems;
mostly from Marc's blog, with some bits of my own:

* btrfs scrub
Errors are corrected along the way if possible (what *is* possible?)
* mount -o recovery
Enable autorecovery attempts if a bad tree root is found at mount 
time.
* mount -o degraded
Allow mounts to continue with missing devices.
(This isn't really a way to recover from corruption, right?)
* btrfs-zero-log
remove the log tree if log tree is corrupt
* btrfs rescue
Recover a damaged btrfs filesystem
chunk-recover
super-recover
How does this relate to btrfs check?
* btrfs check
repair a btrfs filesystem
--repair
--init-csum-tree
--init-extent-tree
How does this relate to btrfs rescue?
* btrfs restore
try to salvage files from a damaged filesystem
(not really repair, it's disk-scraping)


What's the vision for, say, scrub vs. check vs. rescue?  Should they repair the
same errors, only online vs. offline?  If not, what class of errors does one 
fix vs.
the other?  How would an admin know?  Can btrfs check recover a bad tree root
in the same way that mount -o recovery does?  How would I know if I should use
--init-*-tree, or chunk-recover, and what are the ramifications of using
these options?

It feels like recovery tools have been badly splintered, and if there's an
overarching design or vision for btrfs fs repair, I can't tell what it is.
Can anyone help me?



We probably should just consolidate under 3 commands, one for online 
checking, one for offline repair and one for pulling stuff off of the 
disk when things go to hell.  A lot of these tools were born out of the 
fact that we didn't have a fsck tool for a long time so there were these 
stop-gaps put into place, so now it's time to go back and clean it up.


I'll try and do this after I finish my cleanup/sync between kernel and 
progs work and fill out the documentation a little better so it's clear 
when to use what.  Thanks,


Josef



Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Martin Steigerwald
On Thursday, 9 October 2014 at 21:58:53, you wrote:
  * btrfs-zero-log
remove the log tree if log tree is corrupt
  * btrfs rescue
Recover a damaged btrfs filesystem
chunk-recover
super-recover
How does this relate to btrfs check?
  * btrfs check
repair a btrfs filesystem
--repair
--init-csum-tree
--init-extent-tree
How does this relate to btrfs rescue?
 
 These three translate into eight combinations of repairs; adding -o recovery
 there are 9 combinations. I think this is the main source of confusion:
 there are just too many options, and it's completely non-obvious which
 one to use in which situation.
 
 My expectation is that eventually these get consolidated into just check and
 check --repair. As the repair code matures, it'd go into kernel
 autorecovery code. That's a guess on my part, but it's consistent with
 design goals.

Also I think these should at least all be under the btrfs command.

So include btrfs-zero-log in btrfs command.

And how about btrfs repair or btrfs check as an upper category, with the
various options added as subcommands below it? Then there is at least one
command and one place in the manpage to learn about the various options.

But maybe some can be made automatic as well. Or folded into btrfs check --
repair. Ideally it would auto-detect which path to take on filesystem 
recovery.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Martin Steigerwald
On Friday, 10 October 2014 at 10:37:44, Chris Murphy wrote:
 On Oct 10, 2014, at 6:53 AM, Bob Marley bobmar...@shiftmail.org wrote:
  On 10/10/2014 03:58, Chris Murphy wrote:
  * mount -o recovery
  
Enable autorecovery attempts if a bad tree root is found at mount
time.
  
  I'm confused why it's not the default yet. Maybe it's continuing to
  evolve at a pace that suggests something could sneak in that makes
  things worse? It is almost an oxymoron in that I'm manually enabling an
  autorecovery
  
  If true, maybe the closest indication we'd get of btrfs stability is the
  default enabling of autorecovery. 
  No way!
  I wouldn't want a default like that.
  
  If you think about distributed transactions: suppose a sync was issued on
  both sides of a distributed transaction, then power was lost on one side,
  then btrfs had corruption. When I remount it, definitely the worst thing
  that can happen is that it auto-rolls-back to a previous known-good
  state.
 For a general purpose file system, the price of not losing 30 seconds (or
 less) of questionably committed, likely corrupt data is a file system that
 won't mount without user intervention, which requires a secret decoder ring
 to get it to mount at all. And may require the use of specialized tools to
 retrieve that data in any case.
 
 The fail safe behavior is to treat the known good tree root as the default
 tree root, and bypass the bad tree root if it cannot be repaired, so that
 the volume can be mounted with default mount options (i.e. the ones in
 fstab). Otherwise it's a filesystem that isn't well suited for general
 purpose use as rootfs let alone for boot.

To understand this a bit better:

What can be the reasons a recent tree gets corrupted?

I always thought with a controller and device and driver combination that 
honors fsync with BTRFS it would either be the new state or the last known 
good state *anyway*. So where does the need to rollback arise from?

That said all journalling filesystems have some sort of rollback as far as I 
understand: If the last journal entry is incomplete they discard it on journal 
replay. So even there you lose the last seconds of write activity.

But in case fsync() returns, the data needs to be safe on disk. I always 
thought BTRFS honors this under *any* circumstance. If some proposed 
autorollback breaks this guarantee, I think something is broken elsewhere.

And fsync is an fsync is an fsync. Its semantics are clear as crystal. There 
is nothing, absolutely nothing to discuss about it.

An fsync completes if the device itself reported "Yeah, I have the data on 
disk, all safe and cool to go." Anything else is a bug IMO.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Martin Steigerwald
On Wednesday, 8 October 2014 at 14:11:51, Eric Sandeen wrote:
 I was looking at Marc's post:
 
 http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
 
 and it feels like there isn't exactly a cohesive, overarching vision for
 repair of a corrupted btrfs filesystem.
 
 In other words - I'm an admin cruising along, when the kernel throws some
 fs corruption error, or for whatever reason btrfs fails to mount.
 What should I do?
 
 Marc lays out several steps, but to me this highlights that there seem to
 be a lot of disjoint mechanisms out there to deal with these problems;
 mostly from Marc's blog, with some bits of my own:
 
 * btrfs scrub
   Errors are corrected along the way if possible (what *is* possible?)
 * mount -o recovery
   Enable autorecovery attempts if a bad tree root is found at mount 
 time.
 * mount -o degraded
   Allow mounts to continue with missing devices.
   (This isn't really a way to recover from corruption, right?)
 * btrfs-zero-log
   remove the log tree if log tree is corrupt
 * btrfs rescue
   Recover a damaged btrfs filesystem
   chunk-recover
   super-recover
   How does this relate to btrfs check?
 * btrfs check
   repair a btrfs filesystem
   --repair
   --init-csum-tree
   --init-extent-tree
   How does this relate to btrfs rescue?
 * btrfs restore
   try to salvage files from a damaged filesystem
   (not really repair, it's disk-scraping)
 
 
 What's the vision for, say, scrub vs. check vs. rescue?  Should they repair
 the same errors, only online vs. offline?  If not, what class of errors
 does one fix vs. the other?  How would an admin know?  Can btrfs check
 recover a bad tree root in the same way that mount -o recovery does?  How
 would I know if I should use --init-*-tree, or chunk-recover, and what are
 the ramifications of using these options?
 
 It feels like recovery tools have been badly splintered, and if there's an
 overarching design or vision for btrfs fs repair, I can't tell what it is.
 Can anyone help me?

How about taking one step back:

What are the possible corruption cases these tools are meant to address? 
*Where* can BTRFS break and *why*?

What of it can be folded into one command? Where can BTRFS be improved to 
either prevent a corruption from happening or automatically correct it? 
What actions can be determined automatically by the repair tool? What needs to 
be options for the user to choose from? And what guidance would the user need 
to decide?

I.e. really going back to what diagnosing and repairing BTRFS actually 
involves, and then, well… developing a vision of how this all can fit together, 
as you suggested.

As a minimum I suggest having all possible options as a main category in the 
btrfs command, no external commands whatsoever; so if btrfs-zero-log is still 
needed, add it to the btrfs command.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Duncan
Martin Steigerwald posted on Sun, 12 Oct 2014 12:14:01 +0200 as excerpted:

 I always thought with a controller and device and driver combination
 that honors fsync with BTRFS it would either be the new state or the
 last known good state *anyway*. So where does the need to rollback arise
 from?

My understanding here is...

With btrfs a full-tree commit is atomic.  You should get either the old 
tree or the new tree.  However, due to the cascading nature of updates on 
cow-based structures, these full-tree commits are done by default 
(there's a mount-option to adjust it) every 30 seconds.  Between these 
atomic commits partial updates may have occurred.  The btrfs log (the one 
that btrfs-zero-log kills) is limited to between-commit updates, and thus 
to the up to 30 seconds (default) worth of changes since the last full-
tree atomic commit.

In addition to that, there's a history of tree-root commits kept (with 
the superblocks pointing to the last one).  Btrfs-find-tree-root can be 
used to list this history.  The recovery mount option simply allows btrfs 
to fall back to this history, should the current root be corrupted.  
Btrfs restore can be used to list tree roots as well, and can be pointed 
at an appropriate one if necessary.

Fsync forces the file and its corresponding metadata update to the log 
and barring hardware or software bugs should not return until it's safely 
in the log, but I'm not sure whether it forces a full-tree commit.  
Either way the guarantees should be the same.  If the log can be replayed 
or a full-tree commit has occurred since the fsync, the new copy should 
appear.  If it can't, the rollback to the last atomic tree commit should 
return an intact copy of the file from that point.  If the recovery mount 
option is used and a further rollback to an earlier full-tree commit is 
forced, provided it existed at the point of that full-tree commit, the 
intact file at that point should appear.

So if the current tree root is a good one, the log will replay the last 
upto 30 seconds of activity on top of that last atomic tree root.  If the 
current root tree itself is corrupt, the recovery mount option will let 
an earlier one be used.  Obviously in that case the log will be discarded 
since it applies to a later root tree that itself has been discarded.
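
To illustrate the fallback described above, here is a conceptual sketch in C.
This is not btrfs code and none of the structures or names below are real
on-disk formats; the history depth and generation numbers are invented purely
to show the idea of the superblock pointing at the newest committed root while
older roots remain available for a recovery-style mount:

/* Conceptual sketch only -- not btrfs's real structures or API. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define ROOT_HISTORY 4            /* hypothetical number of backup roots kept */

struct tree_root {
    unsigned long long generation;  /* commit number */
    bool checksum_ok;               /* did the root node validate? */
};

struct superblock {
    struct tree_root roots[ROOT_HISTORY];  /* [0] = newest commit */
};

/* Return the newest root that validates, or NULL if none does.  Falling
 * back past roots[0] discards the log tree and anything newer than the
 * chosen root, i.e. up to ~30s of writes since that commit. */
static const struct tree_root *pick_root(const struct superblock *sb,
                                         bool allow_fallback)
{
    for (size_t i = 0; i < ROOT_HISTORY; i++) {
        if (sb->roots[i].checksum_ok)
            return &sb->roots[i];
        if (!allow_fallback)        /* default mount: refuse instead */
            return NULL;
    }
    return NULL;
}

int main(void)
{
    struct superblock sb = {
        .roots = { { 1042, false },   /* newest root is corrupt */
                   { 1041, true  },
                   { 1040, true  },
                   { 1039, true  } },
    };

    const struct tree_root *r = pick_root(&sb, true);   /* "-o recovery" */
    if (r)
        printf("mounting from generation %llu\n", r->generation);
    else
        printf("mount fails; manual repair needed\n");
    return 0;
}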

The debate is whether recovery should be automated so the admin doesn't 
have to care about it, or whether having to manually add that option 
serves as a necessary notifier to the admin that something /did/ go 
wrong, and that an earlier root is being used instead, so more than a few 
seconds worth of data may have disappeared.


As someone else has already suggested, I'd argue that as long as btrfs 
continues to be under the sort of development it's in now, keeping 
recovery as a non-default option is desired.  Once it's optimized and 
considered stable, arguably recovery should be made the default, perhaps 
with a no-recovery option for those who prefer that in-the-face 
notification in the form of a mount error, if btrfs would otherwise fall 
back to an earlier tree root commit.

What worries me, however, is that IMO the recent warning stripping was 
premature.  Btrfs is certainly NOT fully stable or optimized for normal 
use at this point.  We're still using the even/odd PID balancing scheme 
for raid1 reads, for instance, and multi-device writes are still 
serialized when they could be parallelized to a much larger degree (tho 
keeping some serialization is arguably good for data safety).  Arguably 
optimizing that now would be premature optimization since the code itself 
is still subject to change, so I'm not complaining, but by that very same 
token, it *IS* still subject to change, which by definition means it's 
*NOT* stable, so why are we removing all the warnings and giving the 
impression that it IS stable?

The decision wasn't mine to make and I don't know, but while a nice 
suggestion, making recovery-by-default a measure of when btrfs goes 
stable simply won't work, because surely the same folks behind the 
warning stripping would then make sure this indicator, too, said btrfs was 
stable, while the state of the code itself continues to say otherwise.

Meanwhile, if your distributed transactions scenario doesn't account for 
crash and loss of data on one side with real-time backup/redundancy, such 
that loss of a few seconds worth of transactions on a single local 
filesystem is going to kill the entire scenario, I don't think too much 
of that scenario in the first place, and regardless, btrfs, certainly in 
its current state, is definitely NOT an appropriate base for it.  Use 
appropriate tools for the task.  Btrfs at least at this point is simply 
not an appropriate tool for that task.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: What is the vision for btrfs fs repair?

2014-10-11 Thread Goffredo Baroncelli
On 10/10/2014 12:53 PM, Bob Marley wrote:
 
 If true, maybe the closest indication we'd get of btrfs stability is
 the default enabling of autorecovery.
 
 No way! I wouldn't want a default like that.
 
 If you think about distributed transactions: suppose a sync was issued
 on both sides of a distributed transaction, then power was lost on
 one side, then btrfs had corruption. When I remount it, definitely
 the worst thing that can happen is that it auto-rolls-back to a
 previous known-good state.

I cannot agree. I consider a sane default to be a consistent state with 
the recently written data lost, instead of requiring user 
intervention to avoid losing anything.

To address your requirement, we need a super sync command which
ensures that the data are in the filesystem and not only
in the log (as sync should ensure).
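
For what it is worth, a rough userspace approximation of such a "super sync"
seems possible already, assuming that syncfs(2) (which hands the request to
the filesystem's own sync path and, on btrfs, should end in a full transaction
commit) provides the stronger guarantee being asked for; that assumption is
worth checking against the btrfs code, and the path below is only an example:

/* Hedged sketch: fsync() makes the file durable via the log tree, while
 * syncfs() asks for a filesystem-wide sync, which on btrfs is assumed
 * here to end in a full transaction commit.  Path is an example only. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/mnt/btrfs/important.db", O_WRONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (fsync(fd) != 0) {       /* data + metadata into the btrfs log tree */
        perror("fsync");
        return 1;
    }
    if (syncfs(fd) != 0) {      /* filesystem-wide sync / transaction commit */
        perror("syncfs");
        return 1;
    }
    close(fd);
    return 0;
}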

BR

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Bob Marley

On 10/10/2014 03:58, Chris Murphy wrote:



* mount -o recovery
Enable autorecovery attempts if a bad tree root is found at mount 
time.

I'm confused why it's not the default yet. Maybe it's continuing to evolve at a 
pace that suggests something could sneak in that makes things worse? It is 
almost an oxymoron in that I'm manually enabling an autorecovery

If true, maybe the closest indication we'd get of btrfs stability is the default 
enabling of autorecovery.


No way!
I wouldn't want a default like that.

If you think about distributed transactions: suppose a sync was issued on 
both sides of a distributed transaction, then power was lost on one 
side, then btrfs had corruption. When I remount it, definitely the worst 
thing that can happen is that it auto-rolls-back to a previous 
known-good state.


Now if I can express wishes:

I would like an option that spits out all the usable tree roots (or 
what's the name, superblocks?) and not just the newest one which is 
corrupt. And then another option that lets me mount *readonly* starting 
from the tree root I specify. So I can check how much of the data is 
still there. After I decide that such tree root is good, I need another 
option that lets me mount with such tree root in readwrite mode, 
obviously eliminating all tree roots newer than that.
Some time ago I read that mounting the filesystem with an earlier tree 
root was possible, but only by manually erasing the disk regions in 
which the newer superblocks are. This is crazy, it's too risky on too 
many levels, and also as I wrote I want to check what data is available 
on a certain tree root before mounting readwrite from that one.





Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Roman Mamedov
On Fri, 10 Oct 2014 12:53:38 +0200
Bob Marley bobmar...@shiftmail.org wrote:

 On 10/10/2014 03:58, Chris Murphy wrote:
 
  * mount -o recovery
 Enable autorecovery attempts if a bad tree root is found at mount 
  time.
  I'm confused why it's not the default yet. Maybe it's continuing to evolve 
  at a pace that suggests something could sneak in that makes things worse? 
  It is almost an oxymoron in that I'm manually enabling an autorecovery
 
  If true, maybe the closest indication we'd get of btrfs stability is the 
  default enabling of autorecovery.
 
 No way!
 I wouldn't want a default like that.
 
 If you think about distributed transactions: suppose a sync was issued on 
 both sides of a distributed transaction, then power was lost on one 
 side

What distributed transactions? Btrfs is not a clustered filesystem[1], it does
not support and likely will never support being mounted from multiple hosts at
the same time.

[1]http://en.wikipedia.org/wiki/Clustered_file_system

-- 
With respect,
Roman




Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Bob Marley

On 10/10/2014 12:59, Roman Mamedov wrote:

On Fri, 10 Oct 2014 12:53:38 +0200
Bob Marley bobmar...@shiftmail.org wrote:


On 10/10/2014 03:58, Chris Murphy wrote:

* mount -o recovery
Enable autorecovery attempts if a bad tree root is found at mount 
time.

I'm confused why it's not the default yet. Maybe it's continuing to evolve at a 
pace that suggests something could sneak in that makes things worse? It is 
almost an oxymoron in that I'm manually enabling an autorecovery

If true, maybe the closest indication we'd get of btrfs stability is the default 
enabling of autorecovery.

No way!
I wouldn't want a default like that.

If you think about distributed transactions: suppose a sync was issued on
both sides of a distributed transaction, then power was lost on one
side

What distributed transactions? Btrfs is not a clustered filesystem[1], it does
not support and likely will never support being mounted from multiple hosts at
the same time.

[1]http://en.wikipedia.org/wiki/Clustered_file_system



This is not the only way to do a distributed transaction.
Databases can be hosted on the filesystem, and those can do distributed 
transactions.
Think of two bank accounts, one on btrfs fs1 here, and another bank 
account in a database on whatever filesystem in another country. You 
want to debit one account and credit the other one: the filesystems at 
the two sides *must not roll back their state*!! (especially not 
transparently without human intervention)




Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Chris Murphy

On Oct 10, 2014, at 6:53 AM, Bob Marley bobmar...@shiftmail.org wrote:

 On 10/10/2014 03:58, Chris Murphy wrote:
 
 * mount -o recovery
 Enable autorecovery attempts if a bad tree root is found at mount 
 time.
 I'm confused why it's not the default yet. Maybe it's continuing to evolve 
 at a pace that suggests something could sneak in that makes things worse? It 
 is almost an oxymoron in that I'm manually enabling an autorecovery
 
 If true, maybe the closest indication we'd get of btrfs stability is the 
 default enabling of autorecovery.
 
 No way!
 I wouldn't want a default like that.
 
 If you think about distributed transactions: suppose a sync was issued on both 
 sides of a distributed transaction, then power was lost on one side, then 
 btrfs had corruption. When I remount it, definitely the worst thing that can 
 happen is that it auto-rolls-back to a previous known-good state.

For a general purpose file system, the price of not losing 30 seconds (or less) 
of questionably committed, likely corrupt data is a file system that won't mount 
without user intervention, which requires a secret decoder ring to get it to 
mount at all. And may require the use of specialized tools to retrieve that data 
in any case.

The fail safe behavior is to treat the known good tree root as the default tree 
root, and bypass the bad tree root if it cannot be repaired, so that the volume 
can be mounted with default mount options (i.e. the ones in fstab). Otherwise 
it's a filesystem that isn't well suited for general purpose use as rootfs let 
alone for boot.

Chris Murphy



Re: What is the vision for btrfs fs repair?

2014-10-10 Thread cwillu
If -o recovery is necessary, then you're either running into a btrfs
bug, or your hardware is lying about when it has actually written
things to disk.

The first case isn't unheard of, although far less common than it used
to be, and it should continue to improve with time.

In the second case, you're potentially screwed regardless of the
filesystem, without doing hacks like "wait a good long time before
returning from fsync" in the hopes that the disk might actually have
gotten around to performing the write it said had already finished.

On Fri, Oct 10, 2014 at 5:12 AM, Bob Marley bobmar...@shiftmail.org wrote:
 On 10/10/2014 12:59, Roman Mamedov wrote:

 On Fri, 10 Oct 2014 12:53:38 +0200
 Bob Marley bobmar...@shiftmail.org wrote:

 On 10/10/2014 03:58, Chris Murphy wrote:

 * mount -o recovery
 Enable autorecovery attempts if a bad tree root is found at
 mount time.

 I'm confused why it's not the default yet. Maybe it's continuing to
 evolve at a pace that suggests something could sneak in that makes things
 worse? It is almost an oxymoron in that I'm manually enabling an
 autorecovery

  If true, maybe the closest indication we'd get of btrfs stability is the
 default enabling of autorecovery.

 No way!
 I wouldn't want a default like that.

  If you think about distributed transactions: suppose a sync was issued on
 both sides of a distributed transaction, then power was lost on one
 side

 What distributed transactions? Btrfs is not a clustered filesystem[1], it
 does
 not support and likely will never support being mounted from multiple
 hosts at
 the same time.

 [1]http://en.wikipedia.org/wiki/Clustered_file_system


 This is not the only way to do a distributed transaction.
 Databases can be hosted on the filesystem, and those can do distributed
 transactions.
 Think of two bank accounts, one on btrfs fs1 here, and another bank account
 in a database on whatever filesystem in another country. You want to debit
 one account and credit the other one: the filesystems at the two sides *must
 not roll back their state*!! (especially not transparently without human
 intervention)




Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Bob Marley

On 10/10/2014 16:37, Chris Murphy wrote:

The fail safe behavior is to treat the known good tree root as the default tree 
root, and bypass the bad tree root if it cannot be repaired, so that the volume 
can be mounted with default mount options (i.e. the ones in fstab). Otherwise 
it's a filesystem that isn't well suited for general purpose use as rootfs let 
alone for boot.



A filesystem which is suited for general purpose use is a filesystem 
which honors fsync, and doesn't *ever* auto-roll-back without user 
intervention.


Anything different is not suited for database transactions at all. Any 
paid service which has the users' database on btrfs is going to be at 
risk of losing payments, and probably without the company even knowing. 
If btrfs goes this way I hope a big warning is written on the wiki and 
on the manpages stating that this filesystem is totally unsuitable for 
hosting databases performing transactions.


At most I can suggest that a flag in the metadata be added to 
allow/disallow auto-roll-back-on-error on such a filesystem, so people can 
choose between the tolerant and the transaction-safe mode at filesystem creation.




Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Bardur Arantsson
On 2014-10-10 19:43, Bob Marley wrote:
 On 10/10/2014 16:37, Chris Murphy wrote:
 The fail safe behavior is to treat the known good tree root as the
 default tree root, and bypass the bad tree root if it cannot be
 repaired, so that the volume can be mounted with default mount options
 (i.e. the ones in fstab). Otherwise it's a filesystem that isn't well
 suited for general purpose use as rootfs let alone for boot.

 
 A filesystem which is suited for general purpose use is a filesystem
 which honors fsync, and doesn't *ever* auto-roll-back without user
 intervention.
 

A file system cannot do anything about the *DISKS* not honouring a sync
command. That's what the PP was talking about.





Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Austin S Hemmelgarn

On 2014-10-10 13:43, Bob Marley wrote:

On 10/10/2014 16:37, Chris Murphy wrote:

The fail safe behavior is to treat the known good tree root as the
default tree root, and bypass the bad tree root if it cannot be
repaired, so that the volume can be mounted with default mount options
(i.e. the ones in fstab). Otherwise it's a filesystem that isn't well
suited for general purpose use as rootfs let alone for boot.



A filesystem which is suited for general purpose use is a filesystem
which honors fsync, and doesn't *ever* auto-roll-back without user
intervention.

Anything different is not suited for database transactions at all. Any
paid service which has the users' database on btrfs is going to be at
risk of losing payments, and probably without the company even knowing.
If btrfs goes this way I hope a big warning is written on the wiki and
on the manpages stating that this filesystem is totally unsuitable for
hosting databases performing transactions.
If they need reliability, they should have some form of redundancy 
in-place and/or run the database directly on the block device; because 
even ext4, XFS, and pretty much every other filesystem can lose data 
sometimes, the difference being that those tend to give worse results 
when hardware is misbehaving than BTRFS does, because BTRFS usually has 
an old copy of whatever data structure gets corrupted to fall back on.


Also, you really shouldn't be running databases on a BTRFS filesystem at 
the moment anyway, because of the significant performance implications.


At most I can suggest that a flag in the metadata be added to
allow/disallow auto-roll-back-on-error on such filesystem, so people can
decide the tolerant vs. transaction-safe mode at filesystem creation.



The problem with this is that if the auto-recovery code did run (and 
IMHO the kernel should spit out a warning to the system log whenever it 
does), then chances are that you wouldn't have had a consistent view if 
you had prevented it from running either; and, if the database is 
properly distributed/replicated, then it should recover by itself.







Re: What is the vision for btrfs fs repair?

2014-10-10 Thread Eric Sandeen
On 10/10/14 2:35 PM, Austin S Hemmelgarn wrote:
 On 2014-10-10 13:43, Bob Marley wrote:
 On 10/10/2014 16:37, Chris Murphy wrote:
 The fail safe behavior is to treat the known good tree root as
 the default tree root, and bypass the bad tree root if it cannot
 be repaired, so that the volume can be mounted with default mount
 options (i.e. the ones in fstab). Otherwise it's a filesystem
 that isn't well suited for general purpose use as rootfs let
 alone for boot.
 
 
 A filesystem which is suited for general purpose use is a
 filesystem which honors fsync, and doesn't *ever* auto-roll-back
 without user intervention.
 
 Anything different is not suited for database transactions at all.
 Any paid service which has the users' database on btrfs is going to
 be at risk of losing payments, and probably without the company
 even knowing. If btrfs goes this way I hope a big warning is
 written on the wiki and on the manpages stating that this
 filesystem is totally unsuitable for hosting databases performing
 transactions.
 If they need reliability, they should have some form of redundancy
 in-place and/or run the database directly on the block device;
 because even ext4, XFS, and pretty much every other filesystem can
 lose data sometimes,

Not if, e.g., fsync returns.  If the data is gone later, it's a hardware
problem, or occasionally a bug - bugs that are usually found & fixed
pretty quickly.

 the difference being that those tend to give
 worse results when hardware is misbehaving than BTRFS does, because
 BTRFS usually has an old copy of whatever data structure gets
 corrupted to fall back on.

I'm curious, is that based on conjecture or real-world testing?

-Eric



Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Austin S Hemmelgarn

On 2014-10-08 15:11, Eric Sandeen wrote:

I was looking at Marc's post:

http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html

and it feels like there isn't exactly a cohesive, overarching vision for
repair of a corrupted btrfs filesystem.

In other words - I'm an admin cruising along, when the kernel throws some
fs corruption error, or for whatever reason btrfs fails to mount.
What should I do?

Marc lays out several steps, but to me this highlights that there seem to
be a lot of disjoint mechanisms out there to deal with these problems;
mostly from Marc's blog, with some bits of my own:

* btrfs scrub
Errors are corrected along the way if possible (what *is* possible?)
* mount -o recovery
Enable autorecovery attempts if a bad tree root is found at mount 
time.
* mount -o degraded
Allow mounts to continue with missing devices.
(This isn't really a way to recover from corruption, right?)
* btrfs-zero-log
remove the log tree if log tree is corrupt
* btrfs rescue
Recover a damaged btrfs filesystem
chunk-recover
super-recover
How does this relate to btrfs check?
* btrfs check
repair a btrfs filesystem
--repair
--init-csum-tree
--init-extent-tree
How does this relate to btrfs rescue?
* btrfs restore
try to salvage files from a damaged filesystem
(not really repair, it's disk-scraping)


What's the vision for, say, scrub vs. check vs. rescue?  Should they repair the
same errors, only online vs. offline?  If not, what class of errors does one 
fix vs.
the other?  How would an admin know?  Can btrfs check recover a bad tree root
in the same way that mount -o recovery does?  How would I know if I should use
--init-*-tree, or chunk-recover, and what are the ramifications of using
these options?

It feels like recovery tools have been badly splintered, and if there's an
overarching design or vision for btrfs fs repair, I can't tell what it is.
Can anyone help me?


Well, based on my understanding:
* btrfs scrub is intended to be almost exactly equivalent to scrubbing a 
RAID volume; that is, it fixes disparity between multiple copies of the 
same block.  IOW, it isn't really repair per se, but more preventative 
maintenance.  Currently, it only works for cases where you have multiple 
copies of a block (dup, raid1, and raid10 profiles), but support is 
planned for error correction of raid5 and raid6 profiles.
* mount -o recovery I don't know much about, but AFAICT, it's more for 
dealing with metadata-related FS corruption.
* mount -o degraded is used to mount a fs configured for a raid storage 
profile with fewer devices than the profile minimum.  It's primarily so 
that you can get the fs into a state where you can run 'btrfs device 
replace'
* btrfs-zero-log only deals with log tree corruption.  This would be 
roughly equivalent to zeroing out the journal on an XFS or ext4 
filesystem, and should almost never be needed.
* btrfs rescue is intended for low-level recovery of corruption on an 
offline fs.
* chunk-recover I'm not entirely sure about, but I believe it's 
like scrub for a single chunk on an offline fs
* super-recover is for dealing with corrupted superblocks, and 
tries to replace a bad one with one of the other copies (which hopefully isn't 
corrupted)
* btrfs check is intended to (eventually) be equivalent to the fsck 
utility for most other filesystems.  Currently, it's relatively good at 
identifying corruption, but less so at actually fixing it.  There are 
however, some things that it won't catch, like a superblock pointing to 
a corrupted root tree.
* btrfs restore is essentially disk scraping, but with built-in 
knowledge of the filesystem's on-disk structure, which makes it more 
reliable than more generic tools like scalpel for files that are too big 
to fit in the metadata blocks, and it is pretty much essential for 
dealing with transparently compressed files.


In general, my personal procedure for handling a misbehaving BTRFS 
filesystem is:
* Run btrfs check on it WITHOUT ANY OTHER OPTIONS to try to identify 
what's wrong

* Try mounting it using -o recovery
* Try mounting it using -o ro,recovery
* Use -o degraded only if it's a BTRFS raid set that lost a disk
* If btrfs check AND dmesg both seem to indicate that the log tree is 
corrupt, try btrfs-zero-log
* If btrfs check indicated a corrupt superblock, try btrfs rescue 
super-recover

* If all of the above fails, ask for advice on the mailing list or IRC
Also, you should be running btrfs scrub regularly to correct bit-rot and 
force remapping of blocks with read errors.  While BTRFS technically 
handles both transparently on reads, it only corrects things on disk when 
you do a scrub.






Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Duncan
Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
excerpted:

 Also, you should be running btrfs scrub regularly to correct bit-rot
 and force remapping of blocks with read errors.  While BTRFS
 technically handles both transparently on reads, it only corrects things
 on disk when you do a scrub.

AFAIK that isn't quite correct.  Currently, the number of copies is 
limited to two, meaning if one of the two is bad, there's a 50% chance of 
btrfs reading the good one on first try.

If btrfs reads the good copy, it simply uses it.  If btrfs reads the bad 
one, it checks the other one and assuming it's good, replaces the bad one 
with the good one both for the read (which otherwise errors out), and by 
overwriting the bad one.

But here's the rub.  The chances of detecting that bad block are 
relatively low in most cases.  First, the system must try reading it for 
some reason, but even then, chances are 50% it'll pick the good one and 
won't even notice the bad one.

Thus, while btrfs may randomly bump into a bad block and rewrite it with 
the good copy, scrub is the only way to systematically detect and (if 
there's a good copy) fix these checksum errors.  It's not that btrfs 
doesn't do it if it finds them, it's that the chances of finding them are 
relatively low, unless you do a scrub, which systematically checks the 
entire filesystem (well, other than files marked nocsum, or nocow, which 
implies nocsum, or files written when mounted with nodatacow or 
nodatasum).

At least that's the way it /should/ work.  I guess it's possible that 
btrfs isn't doing those routine bump-into-it-and-fix-it fixes yet, but 
if so, that's the first /I/ remember reading of it.
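
A conceptual sketch of that difference (again, not btrfs code; just two copies
of a block, a normal read that verifies only the copy it happens to pick, and a
scrub that verifies both):

/* Conceptual sketch only -- not btrfs code. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct block { bool copy_ok[2]; };   /* two copies, e.g. dup/raid1 profile */

/* Normal read path: pick one copy; only if its checksum fails does the
 * other copy get checked, with the bad one rewritten as a side effect. */
static bool read_block(struct block *b)
{
    int pick = rand() % 2;
    if (!b->copy_ok[pick]) {
        int other = 1 - pick;
        if (!b->copy_ok[other])
            return false;            /* both copies bad: read error */
        b->copy_ok[pick] = true;     /* repair bad copy from the good one */
    }
    return true;                     /* a bad copy may never be noticed */
}

/* Scrub: systematically check every copy and repair from a good one. */
static void scrub_block(struct block *b)
{
    for (int i = 0; i < 2; i++)
        if (!b->copy_ok[i] && b->copy_ok[1 - i])
            b->copy_ok[i] = true;
}

int main(void)
{
    struct block b = { .copy_ok = { true, false } };
    printf("normal read ok: %d (the bad copy may be left in place)\n",
           read_block(&b));
    scrub_block(&b);
    printf("after scrub, both copies ok: %d\n", b.copy_ok[0] && b.copy_ok[1]);
    return 0;
}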

Other than that detail, what you posted matches my knowledge and 
experience, such as it may be as a non-dev list regular, as well.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Hugo Mills
On Thu, Oct 09, 2014 at 11:53:23AM +, Duncan wrote:
 Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
 excerpted:
 
  Also, you should be running btrfs scrub regularly to correct bit-rot
  and force remapping of blocks with read errors.  While BTRFS
  technically handles both transparently on reads, it only corrects things
  on disk when you do a scrub.
 
 AFAIK that isn't quite correct.  Currently, the number of copies is 
 limited to two, meaning if one of the two is bad, there's a 50% chance of 
 btrfs reading the good one on first try.

   Scrub checks both copies, though. It's ordinary reads that don't.

   Hugo.

 If btrfs reads the good copy, it simply uses it.  If btrfs reads the bad 
 one, it checks the other one and assuming it's good, replaces the bad one 
 with the good one both for the read (which otherwise errors out), and by 
 overwriting the bad one.
 
 But here's the rub.  The chances of detecting that bad block are 
 relatively low in most cases.  First, the system must try reading it for 
 some reason, but even then, chances are 50% it'll pick the good one and 
 won't even notice the bad one.
 
 Thus, while btrfs may randomly bump into a bad block and rewrite it with 
 the good copy, scrub is the only way to systematically detect and (if 
 there's a good copy) fix these checksum errors.  It's not that btrfs 
 doesn't do it if it finds them, it's that the chances of finding them are 
 relatively low, unless you do a scrub, which systematically checks the 
 entire filesystem (well, other than files marked nocsum, or nocow, which 
 implies nocsum, or files written when mounted with nodatacow or 
 nodatasum).
 
 At least that's the way it /should/ work.  I guess it's possible that 
 btrfs isn't doing those routine bump-into-it-and-fix-it fixes yet, but 
 if so, that's the first /I/ remember reading of it.
 
 Other than that detail, what you posted matches my knowledge and 
 experience, such as it may be as a non-dev list regular, as well.
 

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Great oxymorons of the world, no. 7: The Simple Truth ---  




Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Austin S Hemmelgarn

On 2014-10-09 07:53, Duncan wrote:

Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
excerpted:


Also, you should be running btrfs scrub regularly to correct bit-rot
and force remapping of blocks with read errors.  While BTRFS
technically handles both transparently on reads, it only corrects things
on disk when you do a scrub.


AFAIK that isn't quite correct.  Currently, the number of copies is
limited to two, meaning if one of the two is bad, there's a 50% chance of
btrfs reading the good one on first try.

If btrfs reads the good copy, it simply uses it.  If btrfs reads the bad
one, it checks the other one and assuming it's good, replaces the bad one
with the good one both for the read (which otherwise errors out), and by
overwriting the bad one.

But here's the rub.  The chances of detecting that bad block are
relatively low in most cases.  First, the system must try reading it for
some reason, but even then, chances are 50% it'll pick the good one and
won't even notice the bad one.

Thus, while btrfs may randomly bump into a bad block and rewrite it with
the good copy, scrub is the only way to systematically detect and (if
there's a good copy) fix these checksum errors.  It's not that btrfs
doesn't do it if it finds them, it's that the chances of finding them are
relatively low, unless you do a scrub, which systematically checks the
entire filesystem (well, other than files marked nocsum, or nocow, which
implies nocsum, or files written when mounted with nodatacow or
nodatasum).

At least that's the way it /should/ work.  I guess it's possible that
btrfs isn't doing those routine bump-into-it-and-fix-it fixes yet, but
if so, that's the first /I/ remember reading of it.


I'm not 100% certain, but I believe it doesn't actually fix things on 
disk when it detects an error during a read; I know it doesn't if the fs
is mounted ro (even if the media is writable), because I did some 
testing to see how 'read-only' mounting a btrfs filesystem really is.


Also, that's a much better description of how multiple copies work than 
I could probably have ever given.







Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Hugo Mills
On Thu, Oct 09, 2014 at 08:07:51AM -0400, Austin S Hemmelgarn wrote:
 On 2014-10-09 07:53, Duncan wrote:
 Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
 excerpted:
 
 Also, you should be running btrfs scrub regularly to correct bit-rot
 and force remapping of blocks with read errors.  While BTRFS
 technically handles both transparently on reads, it only corrects things
 on disk when you do a scrub.
 
 AFAIK that isn't quite correct.  Currently, the number of copies is
 limited to two, meaning if one of the two is bad, there's a 50% chance of
 btrfs reading the good one on first try.
 
 If btrfs reads the good copy, it simply uses it.  If btrfs reads the bad
 one, it checks the other one and assuming it's good, replaces the bad one
 with the good one both for the read (which otherwise errors out), and by
 overwriting the bad one.
 
 But here's the rub.  The chances of detecting that bad block are
 relatively low in most cases.  First, the system must try reading it for
 some reason, but even then, chances are 50% it'll pick the good one and
 won't even notice the bad one.
 
 Thus, while btrfs may randomly bump into a bad block and rewrite it with
 the good copy, scrub is the only way to systematically detect and (if
 there's a good copy) fix these checksum errors.  It's not that btrfs
 doesn't do it if it finds them, it's that the chances of finding them are
 relatively low, unless you do a scrub, which systematically checks the
 entire filesystem (well, other than files marked nocsum, or nocow, which
 implies nocsum, or files written when mounted with nodatacow or
 nodatasum).
 
 At least that's the way it /should/ work.  I guess it's possible that
 btrfs isn't doing those routine bump-into-it-and-fix-it fixes yet, but
 if so, that's the first /I/ remember reading of it.
 
 I'm not 100% certain, but I believe it doesn't actually fix things on disk
 when it detects an error during a read,

   I'm fairly sure it does, as I've had it happen to me. :)

 I know it doesn't if the fs is
 mounted ro (even if the media is writable), because I did some testing to
 see how 'read-only' mounting a btrfs filesystem really is.

   If the FS is RO, then yes, it won't fix things.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Great films about cricket:  Interview with the Umpire ---  




Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Austin S Hemmelgarn

On 2014-10-09 08:12, Hugo Mills wrote:

On Thu, Oct 09, 2014 at 08:07:51AM -0400, Austin S Hemmelgarn wrote:

On 2014-10-09 07:53, Duncan wrote:

Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
excerpted:


Also, you should be running btrfs scrub regularly to correct bit-rot
and force remapping of blocks with read errors.  While BTRFS
technically handles both transparently on reads, it only corrects things
on disk when you do a scrub.


AFAIK that isn't quite correct.  Currently, the number of copies is
limited to two, meaning if one of the two is bad, there's a 50% chance of
btrfs reading the good one on first try.

If btrfs reads the good copy, it simply uses it.  If btrfs reads the bad
one, it checks the other one and assuming it's good, replaces the bad one
with the good one both for the read (which otherwise errors out), and by
overwriting the bad one.

But here's the rub.  The chances of detecting that bad block are
relatively low in most cases.  First, the system must try reading it for
some reason, but even then, chances are 50% it'll pick the good one and
won't even notice the bad one.

Thus, while btrfs may randomly bump into a bad block and rewrite it with
the good copy, scrub is the only way to systematically detect and (if
there's a good copy) fix these checksum errors.  It's not that btrfs
doesn't do it if it finds them, it's that the chances of finding them are
relatively low, unless you do a scrub, which systematically checks the
entire filesystem (well, other than files marked nocsum, or nocow, which
implies nocsum, or files written when mounted with nodatacow or
nodatasum).

At least that's the way it /should/ work.  I guess it's possible that
btrfs isn't doing those routine bump-into-it-and-fix-it fixes yet, but
if so, that's the first /I/ remember reading of it.


I'm not 100% certain, but I believe it doesn't actually fix things on disk
when it detects an error during a read,


I'm fairly sure it does, as I've had it happen to me. :)
I probably just misinterpreted the source code; while I know enough C to 
generally understand things, I'm by far no expert.



I know it doesn't if the fs is
mounted ro (even if the media is writable), because I did some testing to
see how 'read-only' mounting a btrfs filesystem really is.


If the FS is RO, then yes, it won't fix things.

Hugo.








Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Duncan
On Thu, 09 Oct 2014 08:07:51 -0400
Austin S Hemmelgarn ahferro...@gmail.com wrote:

 On 2014-10-09 07:53, Duncan wrote:
  Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
  excerpted:
 
  Also, you should be running btrfs scrub regularly to correct
  bit-rot and force remapping of blocks with read errors.  While
  BTRFS technically handles both transparently on reads, it only
  corrects things on disk when you do a scrub.
 
  AFAIK that isn't quite correct.  Currently, the number of copies is
  limited to two, meaning if one of the two is bad, there's a 50%
  chance of btrfs reading the good one on first try.
 
  If btrfs reads the good copy, it simply uses it.  If btrfs reads
  the bad one, it checks the other one and assuming it's good,
  replaces the bad one with the good one both for the read (which
  otherwise errors out), and by overwriting the bad one.
 
  But here's the rub.  The chances of detecting that bad block are
  relatively low in most cases.  First, the system must try reading
  it for some reason, but even then, chances are 50% it'll pick the
  good one and won't even notice the bad one.
 
  Thus, while btrfs may randomly bump into a bad block and rewrite it
  with the good copy, scrub is the only way to systematically detect
  and (if there's a good copy) fix these checksum errors.  It's not
  that btrfs doesn't do it if it finds them, it's that the chances of
  finding them are relatively low, unless you do a scrub, which
  systematically checks the entire filesystem (well, other than files
  marked nocsum, or nocow, which implies nocsum, or files written
  when mounted with nodatacow or nodatasum).
 
  At least that's the way it /should/ work.  I guess it's possible
  that btrfs isn't doing those routine bump-into-it-and-fix-it
  fixes yet, but if so, that's the first /I/ remember reading of it.
 
 I'm not 100% certain, but I believe it doesn't actually fix things on 
 disk when it detects an error during a read; I know it doesn't if the
 fs is mounted ro (even if the media is writable), because I did some 
 testing to see how 'read-only' mounting a btrfs filesystem really is.

Definitely it won't with a read-only mount.  But then scrub shouldn't
be able to write to a read-only mount either.  The only way a read-only
mount should be writable is if it's mounted (bind-mounted or
btrfs-subvolume-mounted) read-write elsewhere, and the write occurs to
that mount, not the read-only mounted location.

There's even debate about replaying the journal or doing orphan-delete
on read-only mounts (at least on-media, the change could, and arguably
should, occur in RAM and be cached, marking the cache dirty at the
same time so it's appropriately flushed if/when the filesystem goes
writable), with some arguing read-only means just that, don't
write /anything/ to it until it's read-write mounted.

But writable-mounted, detected checksum errors (with a good copy
available) should be rewritten as far as I know.  If not, I'd call it
a bug.  The problem is in the detection, not in the rewriting.  Scrub's
the only way to reliably detect these errors since it's the only thing
that systematically checks /everything/.
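
For anyone who wants to put that into practice, a minimal sketch of a
periodic scrub (mount point and schedule are purely illustrative, not a
recommendation):

  # start a scrub of everything in the filesystem mounted at /mnt/data
  btrfs scrub start /mnt/data
  # later: show progress plus per-device error/corrected counters
  btrfs scrub status -d /mnt/data
  # or run it weekly, in the foreground, from /etc/crontab
  # 0 3 * * 0  root  btrfs scrub start -B /mnt/data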

 Also, that's a much better description of how multiple copies work
 than I could probably have ever given.

Thanks.  =:^)

-- 
Duncan - No HTML messages please, as they are filtered as spam.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Duncan
On Thu, 9 Oct 2014 12:55:50 +0100
Hugo Mills h...@carfax.org.uk wrote:

 On Thu, Oct 09, 2014 at 11:53:23AM +, Duncan wrote:
  Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
  excerpted:
  
   Also, you should be running btrfs scrub regularly to correct
   bit-rot and force remapping of blocks with read errors.  While
   BTRFS technically handles both transparently on reads, it only
   corrects things on disk when you do a scrub.
  
  AFAIK that isn't quite correct.  Currently, the number of copies is 
  limited to two, meaning if one of the two is bad, there's a 50%
  chance of btrfs reading the good one on first try.
 
Scrub checks both copies, though. It's ordinary reads that don't.

While I believe I was clear in full context (see below), agreed.  I was
talking about normal reads in the above, not scrub, as the full quote
should make clear.  I guess I could have made it clearer in the
immediate context, however.  Thanks.

  Thus, while btrfs may randomly bump into a bad block and rewrite it
  with the good copy, scrub is the only way to systematically detect
  and (if there's a good copy) fix these checksum errors.
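
As an aside, for anyone following along: the two copies discussed above
come from the dup and raid1 profiles.  A sketch, with device names purely
illustrative:

  # single device: duplicated metadata, single data
  mkfs.btrfs -m dup -d single /dev/sdb
  # two devices: two copies of both data and metadata
  mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc
  # show which profiles a mounted filesystem is actually using
  btrfs filesystem df /mnt/data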



-- 
Duncan - No HTML messages please, as they are filtered as spam.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Austin S Hemmelgarn

On 2014-10-09 08:34, Duncan wrote:

On Thu, 09 Oct 2014 08:07:51 -0400
Austin S Hemmelgarn ahferro...@gmail.com wrote:


On 2014-10-09 07:53, Duncan wrote:

Austin S Hemmelgarn posted on Thu, 09 Oct 2014 07:29:23 -0400 as
excerpted:


Also, you should be running btrfs scrub regularly to correct
bit-rot and force remapping of blocks with read errors.  While
BTRFS technically handles both transparently on reads, it only
corrects things on disk when you do a scrub.


AFAIK that isn't quite correct.  Currently, the number of copies is
limited to two, meaning if one of the two is bad, there's a 50%
chance of btrfs reading the good one on first try.

If btrfs reads the good copy, it simply uses it.  If btrfs reads
the bad one, it checks the other one and assuming it's good,
replaces the bad one with the good one both for the read (which
otherwise errors out), and by overwriting the bad one.

But here's the rub.  The chances of detecting that bad block are
relatively low in most cases.  First, the system must try reading
it for some reason, but even then, chances are 50% it'll pick the
good one and won't even notice the bad one.

Thus, while btrfs may randomly bump into a bad block and rewrite it
with the good copy, scrub is the only way to systematically detect
and (if there's a good copy) fix these checksum errors.  It's not
that btrfs doesn't do it if it finds them, it's that the chances of
finding them are relatively low, unless you do a scrub, which
systematically checks the entire filesystem (well, other than files
marked nocsum, or nocow, which implies nocsum, or files written
when mounted with nodatacow or nodatasum).

At least that's the way it /should/ work.  I guess it's possible
that btrfs isn't doing those routine bump-into-it-and-fix-it
fixes yet, but if so, that's the first /I/ remember reading of it.


I'm not 100% certain, but I believe it doesn't actually fix things on
disk when it detects an error during a read; I know it doesn't if the
fs is mounted ro (even if the media is writable), because I did some
testing to see how 'read-only' mounting a btrfs filesystem really is.


Definitely it won't with a read-only mount.  But then scrub shouldn't
be able to write to a read-only mount either.  The only way a read-only
mount should be writable is if it's mounted (bind-mounted or
btrfs-subvolume-mounted) read-write elsewhere, and the write occurs to
that mount, not the read-only mounted location.

In theory yes, but there are caveats to this, namely:
* atime updates still happen unless you have mounted the fs with noatime
* The superblock gets updated if there are 'any' writes
* The free space cache 'might' be updated if there are any writes

All in all, a BTRFS filesystem mounted ro is much more read-only than, 
say, ext4 (which at least updates the sb, and old versions replayed the 
journal, in addition to the atime updates).
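
A rough sketch of that kind of test, for reference (device and mount point
are illustrative, and this isn't necessarily how the testing above was
done).  Setting the block device itself read-only first makes any write
attempt fail loudly instead of silently reaching the media:

  # force the underlying device read-only at the block layer
  blockdev --setro /dev/sdb1
  # then mount 'read-only' and watch dmesg for rejected writes
  mount -o ro /dev/sdb1 /mnt/test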


There's even debate about replaying the journal or doing orphan-delete
on read-only mounts (at least on-media, the change could, and arguably
should, occur in RAM and be cached, marking the cache dirty at the
same time so it's appropriately flushed if/when the filesystem goes
writable), with some arguing read-only means just that, don't
write /anything/ to it until it's read-write mounted.

But writable-mounted, detected checksum errors (with a good copy
available) should be rewritten as far as I know.  If not, I'd call it
a bug.  The problem is in the detection, not in the rewriting.  Scrub's
the only way to reliably detect these errors since it's the only thing
that systematically checks /everything/.


Also, that's a much better description of how multiple copies work
than I could probably have ever given.


Thanks.  =:^)








Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Duncan
Austin S Hemmelgarn posted on Thu, 09 Oct 2014 09:18:22 -0400 as
excerpted:

 On 2014-10-09 08:34, Duncan wrote:

 The only way a read-only
 mount should be writable is if it's mounted (bind-mounted or
 btrfs-subvolume-mounted) read-write elsewhere, and the write occurs to
 that mount, not the read-only mounted location.

 In theory yes, but there are caveats to this, namely:
 * atime updates still happen unless you have mounted the fs with noatime

I've been mounting noatime for well over a decade now, exactly due to 
such problems.  But I believe at least /some/ filesystems are truly read-
only when they're mounted as such, and atime updates don't happen on them.

These days I actually apply a patch that changes the default relatime to 
noatime, so I don't even have to have it in my mount-options. =:^)

 * The superblock gets updated if there are 'any' writes

Yeah.  At least in theory, there shouldn't be any writes, however.  As I 
said, in theory even journal replay and orphan delete shouldn't hit media, 
altho handling them in memory and dirtying the cache, so they get written 
if/when the filesystem is remounted read-write, is reasonable.

 * The free space cache 'might' be updated if there are any writes

Makes sense.  But of course that's what I'm arguing: there shouldn't /be/ 
any writes.  Read-only should mean exactly that: don't touch media, 
period.

I remember at one point activating an mdraid1 degraded, read-only, just a 
single device of the 4-way raid1 I was running at the time, to recover 
data from it after the system it was running in died.  The idea was don't 
write to the device at all, because I was still testing the new system, 
and in case I decided to try to reassemble the raid at some point.  Read-
only really NEEDS to be read-only, under such conditions.

Similarly for forensic examination, of course.  If there's a write, any 
write, it's evidence tampering.  Read-only needs to MEAN read-only!
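
For the record, a sketch of assembling a single raid member like that
without letting md itself write anything (device names are illustrative):

  # assemble degraded, but start the array read-only
  mdadm --assemble --run --readonly /dev/md0 /dev/sdb1
  # then a read-only mount on top of it
  mount -o ro /dev/md0 /mnt/rescue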

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Eric Sandeen
On 10/9/14 8:49 AM, Duncan wrote:
 Austin S Hemmelgarn posted on Thu, 09 Oct 2014 09:18:22 -0400 as
 excerpted:
 
 On 2014-10-09 08:34, Duncan wrote:
 
 The only way a read-only
 mount should be writable is if it's mounted (bind-mounted or
 btrfs-subvolume-mounted) read-write elsewhere, and the write occurs to
 that mount, not the read-only mounted location.
 
 In theory yes, but there are caveats to this, namely:
 * atime updates still happen unless you have mounted the fs with noatime

Getting off the topic a bit, but that really shouldn't happen:

#define IS_NOATIME(inode)   __IS_FLG(inode, MS_RDONLY|MS_NOATIME)

and in touch_atime():

if (IS_NOATIME(inode))
        return;

-Eric


Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Chris Murphy

On Oct 8, 2014, at 3:11 PM, Eric Sandeen sand...@redhat.com wrote:

 I was looking at Marc's post:
 
 http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
 
 and it feels like there isn't exactly a cohesive, overarching vision for
 repair of a corrupted btrfs filesystem.

It's definitely confusing compared to any other filesystem I've used on four 
different platforms. And that's when excluding scraping and the functions 
unique to any multiple device volume: scrubs, degraded mount.

To be fair, mdadm doesn't even have a scrub command, it's done via 
'echo check > /sys/block/mdX/md/sync_action'. And meanwhile LVM has pvck, 
vgck, and for 
scrubs it's lvchange --syncaction {check|repair}. These are also completely 
non-obvious.
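
Spelled out, roughly (md device and VG/LV names are made up):

  # md: start a 'check' pass (read and compare, no rewrites)
  echo check > /sys/block/md0/md/sync_action
  # md: or have mismatches rewritten
  echo repair > /sys/block/md0/md/sync_action
  # LVM RAID: the equivalent scrub actions
  lvchange --syncaction check vg0/lv_home
  lvchange --syncaction repair vg0/lv_home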

 * mount -o recovery
   Enable autorecovery attempts if a bad tree root is found at mount 
 time.

I'm confused why it's not the default yet. Maybe it's continuing to evolve at a 
pace that suggests something could sneak in that makes things worse? It is 
almost an oxymoron in that I'm manually enabling an autorecovery

 If true, maybe the closest indication we'd get of btrfs stability is the 
default enabling of autorecovery.

 * btrfs-zero-log
   remove the log tree if log tree is corrupt
 * btrfs rescue
   Recover a damaged btrfs filesystem
   chunk-recover
   super-recover
   How does this relate to btrfs check?
 * btrfs check
   repair a btrfs filesystem
   --repair
   --init-csum-tree
   --init-extent-tree
   How does this relate to btrfs rescue?

These three translate into eight combinations of repairs; adding -o recovery 
makes nine. I think this is the main source of confusion: there are just too 
many options, and it's completely non-obvious which one to use in which 
situation.

My expectation is that eventually these get consolidated into just check and 
check --repair. As the repair code matures, it'd go into kernel autorecovery 
code. That's a guess on my part, but it's consistent with design goals.


 It feels like recovery tools have been badly splintered, and if there's an
 overarching design or vision for btrfs fs repair, I can't tell what it is.
 Can anyone help me?

I suspect it's unintended splintering, and is an artifact that will go away. 
I'd rather the convoluted, fractured nature of repair go away before the scary 
experimental warnings do.


Chris Murphy


Re: What is the vision for btrfs fs repair?

2014-10-09 Thread Duncan
Chris Murphy posted on Thu, 09 Oct 2014 21:58:53 -0400 as excerpted:

 I suspect it's unintended splintering, and is an artifact that will go
 away. I'd rather the convoluted, fractured nature of repair go away
 before the scary experimental warnings do.

Heh, agreed with everything[1], but it's too late for this: the experimental 
warnings are peeled off, while the experimental, or at least horribly 
immature, /behavior/ remains. =:^(

---
[1] ... and a much more logically cohesive and well structured reply than 
I could have managed as my own thoughts simply weren't that well 
organized.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



What is the vision for btrfs fs repair?

2014-10-08 Thread Eric Sandeen
I was looking at Marc's post:

http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html

and it feels like there isn't exactly a cohesive, overarching vision for
repair of a corrupted btrfs filesystem.

In other words - I'm an admin cruising along, when the kernel throws some
fs corruption error, or for whatever reason btrfs fails to mount.
What should I do?

Marc lays out several steps, but to me this highlights that there seem to
be a lot of disjoint mechanisms out there to deal with these problems;
mostly from Marc's blog, with some bits of my own:

* btrfs scrub
Errors are corrected along the way if possible (what *is* possible?)
* mount -o recovery
Enable autorecovery attempts if a bad tree root is found at mount 
time.
* mount -o degraded
Allow mounts to continue with missing devices.
(This isn't really a way to recover from corruption, right?)
* btrfs-zero-log
remove the log tree if log tree is corrupt
* btrfs rescue
Recover a damaged btrfs filesystem
chunk-recover
super-recover
How does this relate to btrfs check?
* btrfs check
repair a btrfs filesystem
--repair
--init-csum-tree
--init-extent-tree
How does this relate to btrfs rescue?
* btrfs restore
try to salvage files from a damaged filesystem
(not really repair, it's disk-scraping)


What's the vision for, say, scrub vs. check vs. rescue?  Should they repair the
same errors, only online vs. offline?  If not, what class of errors does one 
fix vs.
the other?  How would an admin know?  Can btrfs check recover a bad tree root
in the same way that mount -o recovery does?  How would I know if I should use
--init-*-tree, or chunk-recover, and what are the ramifications of using
these options?
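
For concreteness, here is the escalation I'd *guess* an admin is expected
to attempt today, based only on the list above; whether that guess is right
is really part of my question.  Device names and paths are illustrative:

  mount -o recovery /dev/sdb1 /mnt        # kernel falls back to an older tree root
  btrfs rescue super-recover /dev/sdb1    # offline: fix a bad superblock from its copies
  btrfs check /dev/sdb1                   # offline: read-only report, changes nothing
  btrfs check --repair /dev/sdb1          # last resort; may throw data away
  btrfs restore -v /dev/sdb1 /recovered   # give up on repair, scrape files out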

It feels like recovery tools have been badly splintered, and if there's an
overarching design or vision for btrfs fs repair, I can't tell what it is.
Can anyone help me?

Thanks,
-Eric