Can I drop/reset files with corrupted data if they are in a read only snapshot?

2017-07-08 Thread Marc MERLIN
Sorry for the mails, I still have one more problem I'm trying to work
through.

One of my filesystems probably got real corruption due to an unstable
block layer underneath it. My 2 other machines with problems did not
have an unstable block layer and only started having problems recently,
which is why I'm wondering whether 4.11 has issues that 4.8 (which I was
running before) did not.

I have 2 filesystems that mostly pass check --repair, except for some
checksum verify errors which do not get fixed/reset by check --repair.

btrfs scrub shows 2 files that have issues.

First, check --repair should tell me what those "on 739295232" messages
refer to. Can't it print the pathname involved? That would seem like the
useful thing to do :)
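(For what it's worth, btrfs-progs does have a tool that can map a logical byte number back to pathnames on a mounted filesystem; a sketch, assuming the filesystem is mounted at /mnt/dshelf2:)

```shell
# Resolve a logical address (as printed by check/scrub) to pathnames.
# The filesystem must be mounted; /mnt/dshelf2 is an assumed mount point.
btrfs inspect-internal logical-resolve 739295232 /mnt/dshelf2
```

Note that this only resolves data extents; if the address belongs to a metadata tree block (which is likely what these checksum verify errors point at), it won't map to a file.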

Second, OK, after a 24-hour scrub (yes, it's long and slow), I know which
filenames have issues. The problem is that they are inside a read-only btrfs
snapshot. I cannot delete this snapshot, because if I do, I will
destroy a btrfs send/receive relationship that will take 2 + 1 days to
recreate (2 filesystems, each with 2 files to delete).
How can I force-delete the file anyway, or reset the checksum and accept
that the file is corrupted, but not care?
(Once deleted, the next btrfs send/receive will free the blocks anyway.)
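In theory I could flip the snapshot's read-only property, delete the file, and flip it back; a sketch, with a hypothetical snapshot path. But as I understand it, clearing ro on a received snapshot also clears its received_uuid, which breaks the incremental send/receive relationship, i.e. exactly what I'm trying to avoid:

```shell
SNAP=/mnt/dshelf2/snapshot   # hypothetical path; substitute the real one

# Make the read-only snapshot writable. WARNING: on a snapshot created
# by "btrfs receive" this clears received_uuid, breaking future
# incremental send/receive against it.
btrfs property set -ts "$SNAP" ro false

# Remove the corrupted file.
rm "$SNAP/path/to/corrupted/file"

# Flip the snapshot back to read-only.
btrfs property set -ts "$SNAP" ro true
```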

Yes, of course you're going to tell me "well, just btrfs send/receive
the new subvolume with the file deleted and then you can delete the old snapshot
and free the corrupted data blocks".
That sounds like a grand idea, except for the fact that the filesystem
I'm syncing to is the one that's even more corrupted (see my previous post,
where btrfs check --repair just bails), and I'm not really interested in
deleting and recreating that filesystem until I've gotten the source back
to a 100% consistent state.


Here's the not very helpful check --repair output, which doesn't actually
fix this error (it should have an option to show the pathnames and reset
the checksum to pass, giving me a consistent file with corrupted data as
opposed to a corrupted file that will keep giving scrub errors).


gargamel:/mnt/dshelf2# btrfs check --repair  /dev/mapper/dshelf2
enabling repair mode
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
checking extents
checksum verify failed on 739295232 found AB0CFE79 wanted E49CEF52
checksum verify failed on 739295232 found AB0CFE79 wanted E49CEF52
checksum verify failed on 1004978176 found 28A151B1 wanted 78F6E880
checksum verify failed on 1004978176 found 28A151B1 wanted 78F6E880
checksum verify failed on 1004994560 found D6F1289C wanted 0FA88800
checksum verify failed on 1004994560 found D6F1289C wanted 0FA88800
checksum verify failed on 1005010944 found 977BEF09 wanted 22373398
checksum verify failed on 1005010944 found 977BEF09 wanted 22373398
checksum verify failed on 1005027328 found 34BEB207 wanted E6A513DF
checksum verify failed on 1005027328 found 34BEB207 wanted E6A513DF
checksum verify failed on 1005043712 found F5D4AE42 wanted 12BB8F8A
checksum verify failed on 1005043712 found F5D4AE42 wanted 12BB8F8A
checksum verify failed on 1005305856 found 1BF2C6B2 wanted 47612155
checksum verify failed on 1005305856 found 1BF2C6B2 wanted 47612155
checksum verify failed on 1005322240 found 9D6E28D3 wanted 62A9226F
checksum verify failed on 1005322240 found 9D6E28D3 wanted 62A9226F
checksum verify failed on 1005338624 found 43C7415B wanted 0EA181CD
checksum verify failed on 1005338624 found 43C7415B wanted 0EA181CD
checksum verify failed on 1005355008 found 34412580 wanted CE332649
checksum verify failed on 1005355008 found 34412580 wanted CE332649
checksum verify failed on 1005371392 found 1C4E7E82 wanted 45E50CFA
checksum verify failed on 1005371392 found 1C4E7E82 wanted 45E50CFA
checksum verify failed on 1005633536 found 807C372E wanted 43C01363
checksum verify failed on 1005633536 found 807C372E wanted 43C01363
checksum verify failed on 1005649920 found 394F7D66 wanted 33287C40
checksum verify failed on 1005649920 found 394F7D66 wanted 33287C40
checksum verify failed on 1005666304 found EB0C7401 wanted B4F6D008
checksum verify failed on 1005666304 found EB0C7401 wanted B4F6D008
checksum verify failed on 1005682688 found AC3B9712 wanted 1929DF15
checksum verify failed on 1005682688 found AC3B9712 wanted 1929DF15
checksum verify failed on 1005699072 found 2D97416A wanted 9ED13B7A
checksum verify failed on 1005699072 found 2D97416A wanted 9ED13B7A
checksum verify failed on 1005961216 found 38C53268 wanted 498134D2
checksum verify failed on 1005961216 found 38C53268 wanted 498134D2
checksum verify failed on 1005977600 found 83FDF0D8 wanted E053CB4C
checksum verify failed on 1005977600 found 83FDF0D8 wanted E053CB4C
checksum verify failed on 1005993984 found FC14EAA1 wanted 77CC1138
checksum verify failed on 1005993984 found FC14EAA1 wanted 77CC1138
checksum verify failed on 1004716032 found 0D81ACC5 wanted 7D183AC3
checksum verify failed on 1004716032 found 0D81ACC5 wanted 7D183AC3
checksum verify failed on 

We really need a better/working btrfs check --repair

2017-07-08 Thread Marc MERLIN
+Chris

On Sat, Jul 08, 2017 at 09:34:17PM -0700, Marc MERLIN wrote:
> gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_bcache2
> UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57
> checking extents
> ref mismatch on [14655689654272 16384] extent item 0, found 1
> Backref 14655689654272 parent 15455 root 15455 not found in extent tree
> backpointer mismatch on [14655689654272 16384]
> owner ref check failed [14655689654272 16384]
> repair deleting extent record: key 14655689654272 169 1
> adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455
> Repaired extent references for 14655689654272
> root 15455 has a root item with a more recent gen (33682) compared to the
> found root node (0)
> ERROR: failed to repair root items: Invalid argument

On this note: getting hit 3 times on 3 different filesystems that are not
badly damaged, with btrfs check --repair unable to put any of them back
into a working state, really brings home the problem of the lack of a
proper fsck.

I understand that some errors are hard to fix without unknown data loss, but
btrfs check --repair should just do what it takes to put the filesystem back
into a consistent state, never mind what data is lost.
Restoring 10 to 20TB of data is getting old and is not really an acceptable
answer as the only way out.
I should not have to recreate a filesystem as the only way to bring it back
to a working state. 

Before Duncan tells me my filesystems are too big and that I should keep to
very small filesystems so there's less work each time btrfs gets corrupted
and again fails to bring the filesystem back to a usable state after
discarding some data: that's just not an acceptable answer long term, and by
long term I honestly mean now.
I just have data that doesn't segment well, and the more small filesystems I
make, the more time I'll waste managing them all and dealing with which one
gets full first :(

So, whether or not 4.11 has a corruption problem, please put some resources
behind btrfs check --repair, be it the lowmem mode or the regular one.

Thank you
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)

2017-07-08 Thread Marc MERLIN
Sigh,

This is now the 3rd filesystem I have (on 3 different machines) that is
getting corruption of some kind (on 4.11.6).
This is starting to look suspicious :-/

Can I fix this filesystem in some other way?
gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 
enabling repair mode
Checking filesystem on /dev/mapper/crypt_bcache2
UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57
checking extents
ref mismatch on [14655689654272 16384] extent item 0, found 1
Backref 14655689654272 parent 15455 root 15455 not found in extent tree
backpointer mismatch on [14655689654272 16384]
owner ref check failed [14655689654272 16384]
repair deleting extent record: key 14655689654272 169 1
adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455
Repaired extent references for 14655689654272
root 15455 has a root item with a more recent gen (33682) compared to the found 
root node (0)
ERROR: failed to repair root items: Invalid argument

Recreating the filesystem is going to take me a week of work, a lot of it
manual, and I'm not feeling very good about doing this, since the backup
server this is a backup of is also seeing some (hopefully minor) problems
too.

I really hope there isn't a new corruption problem in 4.11, because when
I'm getting corruption on my laptop, my backup server, and the backup of my
backup server, I'm starting to run out of redundant backups :(
(and I'm not mentioning all the time this is costing me)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Re: raid10 array lost with single disk failure?

2017-07-08 Thread Duncan
Adam Bahe posted on Fri, 07 Jul 2017 23:26:31 -0500 as excerpted:

> I did recently upgrade the kernel a few days ago from
> 4.8.7-1.el7.elrepo.x86_64 to 4.10.6-1.el7.elrepo.x86_64. I had also
> added a new 6TB disk a few days ago but I'm not sure if the balance
> finished as it locked up sometime today when I was at work. Any ideas
> how I can recover? Even if I have 1 bad disk, raid10 should have kept my
> data safe no? Is there anything I can do to recover?

Yes, btrfs raid10 should be fine with a single bad device.  That's 
unlikely to be the issue.

But you did well to bring up the balance.  Have you tried mounting with 
the "skip_balance" mount option?

Sometimes a balance will run into a previously undetected problem with 
the filesystem and crash.  While mounting would otherwise still work, as 
soon as the filesystem goes active at the kernel level and before the 
mount call returns to userspace, the kernel will see the in-progress 
balance and attempt to continue it.  But if it crashed while processing a 
particular block group (aka chunk), of course that's the first one in 
line to continue the balance with, which will naturally crash again as it 
comes to the same inconsistency that triggered the crash the first time.

So the skip_balance mount option was invented to create a work-around and 
allow you to mount the filesystem again. =:^)

The fact that it sits there for a while trying to do IO on all devices 
before it crashes is another clue it's probably the resumed balance 
crashing things as it comes to the same inconsistency that triggered the 
original crash during balance, so it's very likely that skip_balance will 
help. =:^)
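A minimal sketch of the work-around (device and mount point are assumptions; substitute your own):

```shell
# Mount while telling the kernel NOT to auto-resume the paused balance;
# the balance state stays recorded on disk and can be resumed later.
mount -o skip_balance /dev/sdb /mnt/array
```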

Assuming that lets you mount, the next thing I'd try is a btrfs scrub.  
Chances are it'll find some checksum problems, but given that you're 
running raid10, there's a second copy it can try to use to correct the 
bad one and there's a reasonably good chance scrub will find and fix your 
problems.  Even if it can't fix them all, it should get you closer, with 
less chance at making things worse instead of better than more risky 
options such as btrfs check with --repair.
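Assuming the array is mounted at /mnt/array, the scrub step looks roughly like:

```shell
# Start a scrub in the background; -d reports statistics per device.
btrfs scrub start -d /mnt/array

# Check progress and per-device error counts while it runs or after.
btrfs scrub status -d /mnt/array
```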

If the scrub completes with no uncorrected errors, I'd do an umount/mount 
cycle or a reboot just to be sure -- don't forget the skip_balance option 
again, tho. Then, making sure you're not doing anything that a crash would 
interrupt, and having taken the opportunity to update your backups if you 
consider the data worth more than the time/trouble/resources a backup 
requires, try a balance resume.

Once the balance resume gets reasonably past the time it otherwise took 
to crash, you can reasonably assume you've safely corrected at least 
/that/ inconsistency, and hope the scrub took care of any others before 
you got to them.
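The remount-and-resume cycle above might look like this (paths assumed):

```shell
# Cycle the mount, keeping skip_balance so the kernel doesn't resume
# the balance on its own before you're ready.
umount /mnt/array
mount -o skip_balance /dev/sdb /mnt/array

# With backups updated, explicitly resume the paused balance...
btrfs balance resume /mnt/array

# ...and watch its progress from another terminal.
btrfs balance status /mnt/array
```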

But of course all scrub does is verify checksums. Where there's a second 
copy (as there is with dup, raid1 and raid10 modes) it will attempt a 
repair of the bad copy from the second one, verifying that one as well in 
the process. If the second copy of the block is bad too, or there is no 
second copy, it'll detect but not be able to fix the block with a bad 
checksum; and if a block has a valid checksum but is logically invalid 
for other reasons, scrub won't detect it at all, because /all/ it does 
is verify checksums, not actual filesystem consistency. That's what the 
somewhat more risky btrfs check is for (risky if --repair or another fix 
option is used; read-only mode only detects and doesn't attempt to fix).

So if skip_balance doesn't work, or it does but scrub can't fix all the 
errors it finds, or scrub fixes everything it detects but a balance 
resume still crashes, then it's time to try riskier fixes.  I'll let 
others guide you there if needed, but will leave you with one reminder...

Sysadmin's first rule of backups:

Don't test fate and challenge reality!  Have your backups or regardless 
of claims to the contrary you're defining your data as throw-away value, 
and eventually, fate and reality are going to call you on it!

So don't worry too much even if you lose the filesystem. Either you have 
backups and can restore from them if necessary, or you defined the data 
as not worth the trouble of those backups, and losing it isn't a big 
deal. In either case you saved what was truly important to you: either 
the data, because it was important enough to have backups, or the 
time/resources/trouble you would have spent making those backups, which 
you keep regardless of whether the data can be saved. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: raid10 array lost with single disk failure?

2017-07-08 Thread Duncan
Adam Bahe posted on Fri, 07 Jul 2017 23:40:20 -0500 as excerpted:

> Some additional information. I am running Rockstor just like Daniel
> Brady noted in his post just before mine titled "Chunk root problem".
> Sorry I am somewhat unfamiliar with newsgroups so I am not sure how to
> reply to his thread before I was subscribed. But I am noticing something
> in my logs very similar to his, I get:
> 
> [  716.902506] BTRFS error (device sdb): failed to read the system
> array: -5
> [  716.918284] BTRFS error (device sdb): open_ctree failed
> [  717.004162] BTRFS warning (device sdb): 'recovery' is deprecated,
> use 'usebackuproot' instead
> [  717.004165] BTRFS info (device sdb): trying to use backup root at
> mount time
> [  717.004167] BTRFS info (device sdb): disk space caching is enabled
> [  717.004168] BTRFS info (device sdb): has skinny extents
> [  717.005673] BTRFS error (device sdb): failed to read the system
> array: -5
> [  717.020248] BTRFS error (device sdb): open_ctree failed
> 
> He also received a similar open_ctree failed message after he upgraded
> his kernel on Rockstor to 4.10.6-1.el7.elrepo.x86_64 and
> btrfs-progs-4.10.1-0.rockstor.x86_64.

FWIW, that's not significant.  "Open_ctree failed" is simply btrfs's 
generic failure-to-mount message.  It tells you nothing about the real 
problem, so other clues must be used to discern that.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: interrupt btrfs filesystem defragment -r /

2017-07-08 Thread Hugo Mills
On Sat, Jul 08, 2017 at 01:34:44PM +0200, David Arendt wrote:
> Hi,
> 
> Is it safe to interrupt a btrfs filesystem defrag -r / by using ctrl-c
> or should it be avoided?

   Yes, it's safe.

   Hugo.

-- 
Hugo Mills | Klytus, I'm bored. What plaything can you offer me
hugo@... carfax.org.uk | today?
http://carfax.org.uk/  |
PGP: E2AB1DE4  |  Ming the Merciless, Flash Gordon




interrupt btrfs filesystem defragment -r /

2017-07-08 Thread David Arendt
Hi,

Is it safe to interrupt a btrfs filesystem defrag -r / by using ctrl-c
or should it be avoided?

Thanks in advance,

David Arendt
