Re: Is stability a joke?

2016-09-12 Thread Zoiled

Chris Mason wrote:



On 09/11/2016 04:55 AM, Waxhead wrote:

I have been following BTRFS for years and have recently been starting to
use BTRFS more and more, and as always BTRFS' stability is a hot topic.
Some say that BTRFS is a dead-end research project while others claim
the opposite.

Taking a quick glance at the wiki does not tell you much about what is safe
to use or not, though it does point to some who are using BTRFS in
production.

While BTRFS can apparently work well in production, it does have some
caveats, and finding out which features are safe or not can be problematic.
I especially think that new users of BTRFS can easily be bitten if
they do not do a lot of research on it first.

The Debian wiki for BTRFS (which is recent by the way) contains a bunch
of warnings and recommendations and is for me a bit better than the
official BTRFS wiki when it comes to how to decide what features to use.

The Nouveau graphics driver has a nice feature matrix on its webpage,
and I think that BTRFS should perhaps consider doing something like that
on its official wiki as well.

For example, something along the lines of (the statuses are taken
out of thin air, just for demonstration purposes)



The out-of-thin-air part is a little confusing; I'm not sure if you're 
basing this on reports you've read?


Well, to be honest, I used "whatever I felt was right" more or less in 
that table; as I wrote, it was only for demonstration purposes, to 
show how such a table could look.
I'm in favor of flagging device replace with raid5/6 as not supported yet. 
That seems to be where most of the problems are coming in.


The compression framework shouldn't allow one to work well while the 
other is unusable.
OK, good to know. However, the Debian wiki, as well as the link to 
the mailing list, only mentions LZO compression (as far as I 
remember), and I have no idea myself how much difference there is between 
the LZO and ZLIB code.


There were problems with autodefrag related to snapshot-aware defrag, 
so Josef disabled the snapshot-aware part.


In general, we put btrfs through heavy use at Facebook.  The CRCs have 
found serious hardware problems the other filesystems missed.


We've also uncovered performance problems and some serious bugs, 
both in btrfs and the other filesystems.  With the other filesystems 
the fixes were usually upstream (doubly true for the most serious 
problems), and with btrfs we usually had to make the fixes ourselves.


-chris

I'll just pop this in here, since I assume most people will read the 
response to your comment:


I think I made my point. The wiki lacks some good documentation on 
what's safe to use and what's not. Yesterday I (Svein Engelsgjerd) put 
a table on the main wiki, and someone has moved it to a status 
page and also improved the layout a bit. It is a tad more complex 
than my version, but also a lot better for slightly more advanced 
users, and it actually made my view on things a bit clearer as well.


I am glad that, by bringing this up, I (hopefully) contributed to 
improving the documentation a tiny bit! :)




Re: Is stability a joke?

2016-09-11 Thread Zoiled

Martin Steigerwald wrote:

On Sunday, 11 September 2016 at 10:55:21 CEST, Waxhead wrote:

I have been following BTRFS for years and have recently been starting to
use BTRFS more and more, and as always BTRFS' stability is a hot topic.
Some say that BTRFS is a dead-end research project while others claim
the opposite.

First off: On my systems BTRFS definitely runs too stable for a research
project. Actually, I have zero issues with the stability of BTRFS on *any* of my
systems at the moment, and have had none in the last half year.

The only issue I had until about half a year ago was BTRFS getting stuck
seeking free space on a highly fragmented RAID 1 + compress=lzo /home. This
went away with either kernel 4.4 or 4.5.

Additionally I never ever lost even a single byte of data on my own BTRFS
filesystems. I had a checksum failure on one of the SSDs, but BTRFS RAID 1
repaired it.


Where do I use BTRFS?

1) On this ThinkPad T520 with two SSDs. /home and / in RAID 1, another data
volume as single. In case you can read German, search blog.teamix.de for
BTRFS.

2) On my music box ThinkPad T42 for /home. I did not bother to change / so far
and may never do so for this laptop. It has a slow 2.5 inch harddisk.

3) I used it on a workstation at work as well, for a data volume in RAID 1. But
the workstation is no more (not due to a filesystem failure).

4) On a server VM for /home with Maildirs and Owncloud data. /var is still on
Ext4, but I want to migrate it as well. Whether I ever change /, I don't know.

5) On another server VM, a backup VM which I currently use with borgbackup.
With borgbackup I actually wouldn't really need BTRFS, but well…

6) On *all* of my external eSATA-based backup hard disks for snapshotting older
states of the backups.

In other words, you are one of those who claim the opposite :) I have 
also run btrfs myself on a "toy" filesystem since 2013 without any 
issues, but this is more or less irrelevant, since some people have 
experienced data loss thanks to unstable features that are not clearly 
marked as such.
And the claim that you have not lost a single byte of data does not 
make sense: how did you verify this? SHA256 against a backup? :)
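
For what it's worth, a check like that could look roughly like the sketch
below (Python; the paths are hypothetical, and it only proves anything if
the backup is known-good and the tree has not changed since it was taken):

import hashlib
from pathlib import Path

LIVE = Path("/home")               # hypothetical live btrfs subvolume
BACKUP = Path("/mnt/backup/home")  # hypothetical byte-identical backup copy

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

mismatches = 0
for live_file in LIVE.rglob("*"):
    if not live_file.is_file() or live_file.is_symlink():
        continue
    backup_file = BACKUP / live_file.relative_to(LIVE)
    if not backup_file.is_file():
        continue  # files added or removed since the backup are not corruption evidence
    if sha256_of(live_file) != sha256_of(backup_file):
        mismatches += 1
        print("MISMATCH:", live_file)

print(mismatches, "mismatching file(s)")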

The Debian wiki for BTRFS (which is recent by the way) contains a bunch
of warnings and recommendations and is for me a bit better than the
official BTRFS wiki when it comes to how to decide what features to use.

Nice page. I wasn't aware of this one.

If you use BTRFS with Debian, I suggest usually using the recent backport
kernel, currently 4.6.

Hmmm, maybe I had better remove that compress=lzo mount option. I never saw any
issue with it, though. I will research what they say about it.

My point exactly: you did not know about this, and hence risked your 
data being gnawed on.

The Nouveau graphics driver has a nice feature matrix on its webpage,
and I think that BTRFS should perhaps consider doing something like that
on its official wiki as well.

BTRFS also has a feature matrix. The links to it are in the "News" section,
however:

https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature

I disagree, this is not a feature/stability matrix. It is clearly a 
changelog by kernel version.

The thing is: this just seems to be a matrix of when a feature was implemented,
not when it is considered stable. I think this could be done with colors
or so, like red for not supported, yellow for implemented and green for
production-ready.

Exactly, just like the Nouveau matrix. It clearly shows what you can 
expect from it.

Another hint you can get by reading the SLES 12 release notes. SUSE has dared to
support BTRFS for quite a while – frankly, I think for SLES 11 SP 3 this was
premature, at least for the initial release without updates. I have a VM where
I can very easily get BTRFS to say it is full while it still has 2 GB free.
But well… this still seems to happen for some people, according to threads
on the BTRFS mailing list.

SUSE doesn't support all of BTRFS. They even put features they do not support
behind an "allow_unsupported=1" module option:

https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-314697

But they even seem to contradict themselves by claiming they support RAID 0,
RAID 1 and RAID 10, but not RAID 5 or RAID 6, and then putting RAID behind that
module option – or maybe I misunderstood their RAID statement:

"Btrfs is supported on top of MD (multiple devices) and DM (device mapper)
configurations. Use the YaST partitioner to achieve a proper setup.
Multivolume Btrfs is supported in RAID0, RAID1, and RAID10 profiles in SUSE
Linux Enterprise 12, higher RAID levels are not yet supported, but might be
enabled with a future service pack."

and they only support BTRFS on MD for RAID. They also do not support
compression yet. They do not even support big metadata.

https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221

Interestingly enough, Red Hat only supports BTRFS as a technology preview, even
with RHEL 7.

I would much rather 

Re: btrfs filesystem usage - Wrong Unallocated indications - RAID10

2016-05-23 Thread Zoiled

Marco Lorenzo Crociani wrote:

Hi,
as I wrote today in IRC, I experienced an issue with 'btrfs filesystem usage'.

I have a 4-partition RAID10 btrfs filesystem that is almost full.
'btrfs filesystem usage' reports wrong "Unallocated" indications.

Linux 4.5.3
btrfs-progs v4.5.3


# btrfs fi usage /data/

Overall:
Device size:  13.93TiB
Device allocated:  13.77TiB
Device unallocated: 167.54GiB
Device missing: 0.00B
Used:  13.44TiB
Free (estimated): 244.39GiB(min: 244.39GiB)
Data ratio:  2.00
Metadata ratio:  2.00
Global reserve: 512.00MiB(used: 0.00B)

Data,single: Size:8.00MiB, Used:0.00B
   /dev/sda4   8.00MiB

Data,RAID10: Size:6.87TiB, Used:6.71TiB
   /dev/sda4   1.72TiB
   /dev/sdb3   1.72TiB
   /dev/sdc3   1.72TiB
   /dev/sdd3   1.72TiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sda4   8.00MiB

Metadata,RAID10: Size:19.00GiB, Used:14.15GiB
   /dev/sda4   4.75GiB
   /dev/sdb3   4.75GiB
   /dev/sdc3   4.75GiB
   /dev/sdd3   4.75GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sda4   4.00MiB

System,RAID10: Size:16.00MiB, Used:768.00KiB
   /dev/sda4   4.00MiB
   /dev/sdb3   4.00MiB
   /dev/sdc3   4.00MiB
   /dev/sdd3   4.00MiB

Unallocated:
   /dev/sda4   1.76TiB
   /dev/sdb3   1.76TiB
   /dev/sdc3   1.76TiB
   /dev/sdd3   1.76TiB

-- 


# btrfs fi show /data/
Label: 'data'  uuid: df6639d5-3ef2-4ff6-a871-9ede440e2dae
Total devices 4 FS bytes used 6.72TiB
devid1 size 3.48TiB used 3.44TiB path /dev/sda4
devid2 size 3.48TiB used 3.44TiB path /dev/sdb3
devid3 size 3.48TiB used 3.44TiB path /dev/sdc3
devid4 size 3.48TiB used 3.44TiB path /dev/sdd3

-- 


# btrfs fi df /data/
Data, RAID10: total=6.87TiB, used=6.71TiB
Data, single: total=8.00MiB, used=0.00B
System, RAID10: total=16.00MiB, used=768.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID10: total=19.00GiB, used=14.15GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

-- 


# df -h
/dev/sda4  7,0T  6,8T  245G  97% /data

Regards,

For what it's worth... I have an 8-disk (7x 300GB + 1x 500GB) data 
raid10, metadata raid1 setup, and I get the following output from btrfs...


Label: 'xxyyzz'  uuid: 12345678-9abc-def1-2345-6789abcdef01
Total devices 8 FS bytes used 1.05TiB
devid1 size 268.05GiB used 265.94GiB path /dev/sda1
devid2 size 279.40GiB used 277.22GiB path /dev/sdb
devid3 size 279.40GiB used 277.32GiB path /dev/sdc
devid4 size 279.40GiB used 278.73GiB path /dev/sdd
devid5 size 279.40GiB used 277.72GiB path /dev/sde
devid6 size 279.40GiB used 278.61GiB path /dev/sdf
devid7 size 279.40GiB used 278.82GiB path /dev/sdg
devid8 size 465.76GiB used 230.99GiB path /dev/sdh

# btrfs filesystem usage -T /
Overall:
Device size:   2.35TiB
Device allocated:  2.11TiB
Device unallocated:  244.83GiB
Device missing:  0.00B
Used:  2.11TiB
Free (estimated):122.57GiB  (min: 122.57GiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

                 Data       Data       Metadata   System
Id Path          RAID1      RAID10     RAID1      RAID1      Unallocated
-- ----------  ---------- ---------- ---------- ---------- -----------
 1 /dev/sda1            -  132.97GiB          -          -   135.08GiB
 2 /dev/sdb             -  138.61GiB          -          -   140.79GiB
 3 /dev/sdc             -  138.66GiB          -          -   140.74GiB
 4 /dev/sdd             -  138.87GiB    1.00GiB          -   139.53GiB
 5 /dev/sde       1.00GiB  137.86GiB    1.00GiB          -   139.53GiB
 6 /dev/sdf             -  138.81GiB    1.00GiB          -   139.59GiB
 7 /dev/sdg       1.00GiB  138.38GiB    1.00GiB   64.00MiB   138.96GiB
 8 /dev/sdh             -  113.46GiB    4.00GiB   64.00MiB   348.24GiB
-- ----------  ---------- ---------- ---------- ---------- -----------
   Total         1.00GiB    1.05TiB    4.00GiB   64.00MiB     1.29TiB
   Used       1007.30MiB    1.05TiB    1.66GiB  400.00KiB

What I don't get is... how can I have 244.8 GB unallocated when the 
table clearly shows that there is as much as 1.29 TiB unallocated? 
That does not appear to make sense, to me at least...
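
To make the mismatch concrete, here is a quick sanity check of the figures
above (Python, numbers copied straight from the output; the comment about
halved RAID10 values is only an observation, not a diagnosis):

# Per-device "Unallocated" column from the -T table, in GiB
unallocated_column = [135.08, 140.79, 140.74, 139.53, 139.53, 139.59, 138.96, 348.24]
print(sum(unallocated_column))      # ~1322.5 GiB, i.e. the table's 1.29 TiB total

# "Overall" section: Device size minus Device allocated, in GiB
print(2.35 * 1024 - 2.11 * 1024)    # ~245.8 GiB, i.e. the reported 244.83 GiB

# Oddly, each RAID10 entry in the table is exactly half the per-device "used"
# from 'btrfs fi show' (e.g. 132.97 * 2 = 265.94 for sda1), so the Unallocated
# column may be subtracting logical rather than raw allocation -- just a guess.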



Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-11-05 Thread Zoiled

Duncan wrote:

Austin S Hemmelgarn posted on Wed, 04 Nov 2015 13:45:37 -0500 as
excerpted:


On 2015-11-04 13:01, Janos Toth F. wrote:

But the worst part is that there are some ISO files which were
seemingly copied without errors but whose external checksums (the ones
I can calculate with md5sum and compare to those supplied by
the publisher of the ISO file) don't match!
Well... this, I cannot understand.
How could these files become corrupt from a single disk failure? And
more importantly: how could these files be copied without errors? Why
didn't Btrfs give a read error when the checksums didn't add up?

If you can prove that there was a checksum mismatch and BTRFS returned
invalid data instead of a read error or going to the other disk, then
that is a very serious bug that needs to be fixed.  You need to keep in
mind also however that it's completely possible that the data was bad
before you wrote it to the filesystem, and if that's the case, there's
nothing any filesystem can do to fix it for you.

As Austin suggests, if btrfs is returning data, and you haven't turned
off checksumming with nodatasum or nocow, then it's almost certainly
returning the data it was given to write out in the first place.  Whether
that data it was given to write out was correct, however, is an
/entirely/ different matter.

If ISOs are failing their external checksums, then something is going
on.  Had you verified the external checksums when you first got the
files?  That is, are you sure the files were correct as downloaded and/or
ripped?

Where were the ISOs stored between original procurement/validation and
writing to btrfs?  Is it possible you still have some/all of them on that
media?  Do they still external-checksum-verify there?

Basically, assuming btrfs checksums are validating, there are three other
likely possibilities for where the corruption could have come from before
writing to btrfs.  Either the files were bad as downloaded or otherwise
procured -- which is why I asked whether you verified them upon receipt
-- or you have memory that's going bad, or your temporary storage is
going bad, before the files ever got written to btrfs.

The memory going bad is a particularly worrying possibility,
considering...


Now I am really considering moving from Linux to Windows and from
Btrfs RAID-5 to Storage Spaces RAID-1 + ReFS (the only limitation is
that ReFS is only "self-healing" on RAID-1, not RAID-5, so I need a new
motherboard with more native SATA connectors and an extra HDD). That
one seemed to actually do what it promises (abort any read operation
upon checksum errors [checksum verification always happens seamlessly
on every read], but look at the redundant data first and seamlessly
"self-heal" if possible). The only thing which made Btrfs look like a
better alternative was the RAID-5 support. But I recently experienced
two cases of 1 drive out of 3 failing, and it always turned out to be a
smaller or bigger disaster (completely lost data or inconsistent data).

Have you considered looking into ZFS?  I hate to suggest it as an
alternative to BTRFS, but it's a much more mature and well-tested
technology than ReFS, and has many of the same features as BTRFS (and
even has the option for triple parity instead of the double you get with
RAID6).  If you do consider ZFS, make a point to look at FreeBSD in
addition to the Linux version; the BSD one was a much better-written
port of the original Solaris drivers and has better performance in many
cases (and as much as I hate to admit it, BSD is way more reliable than
Linux in most use cases).

You should also seriously consider whether the convenience of having a
filesystem that fixes internal errors itself with no user intervention
is worth the risk of it corrupting your data.  Returning correct data
whenever possible is one thing, being 'self-healing' is completely
different.  When you start talking about things that automatically fix
internal errors without user intervention is when most seasoned system
administrators start to get really nervous.  Self correcting systems
have just as much chance to make things worse as they do to make things
better, and most of them depend on the underlying hardware working
correctly to actually provide any guarantee of reliability.

I too would point you at ZFS, but there's one VERY BIG caveat, and one
related smaller one!

The people who have a lot of ZFS experience say it's generally quite
reliable, but gobs of **RELIABLE** memory are *absolutely* *critical*!
The self-healing works well, *PROVIDED* memory isn't producing errors.
Absolutely reliable memory is in fact *so* critical, that running ZFS on
non-ECC memory is severely discouraged as a very real risk to your data.

Which is why the above hints that your memory may be bad are so
worrying.  Don't even *THINK* about ZFS, particularly its self-healing
features, if you're not absolutely sure your memory is 100% reliable,
because apparently, based on the comments I've seen, if it's not, you

Re: Will BTRFS repair or restore data if corrupted?

2012-01-26 Thread Zoiled

Stefan Behrens wrote:

On 1/26/2012 9:59 AM, Hugo Mills wrote:

On Thu, Jan 26, 2012 at 12:27:57AM +0100, Waxhead wrote:

[...]

Will BTRFS try to repair the corrupt data or will it simply silently
restore the data without the user knowing that a file has been
fixed?

No, it'll just return the good copy and report the failure in the
system logs. If you want to fix the corrupt data, you need to use
scrub, which will check everything and fix blocks with failed
checksums.

Since 3.2, btrfs rewrites the corrupt disk block (commit 4a54c8c and
f4a8e65 from Jan Schmidt), even without scrub.



So if I, for example, edit a text file three times and store it, I can get 
the following:

Version 1: I currently like cheese
Version 2: I currently like onions
Version 3: I currently like apples

As far as I understand, a disk corruption might result in me suddenly 
liking onions (or even cheese) instead of apples without any warning 
except in syslog?! I really hope I have misunderstood the concept and 
that there is some error correction code somewhere.



Re: Will BTRFS repair or restore data if corrupted?

2012-01-26 Thread Zoiled

cwillu wrote:

So if I, for example, edit a text file three times and store it, I can get the
following:
Version 1: I currently like cheese
Version 2: I currently like onions
Version 3: I currently like apples
As far as I understand, a disk corruption might result in me suddenly liking
onions (or even cheese) instead of apples without any warning except in
syslog?! I really hope I have misunderstood the concept and that there is
some error correction code somewhere.

Yes, you've completely misunderstood the concept :p

There are CRCs on each 4k block of data; if one copy fails the
checksum, and a second copy is available and matches, then the good
data will be returned and btrfs will overwrite the corrupted copy with
the good one.  If there isn't another copy, then an I/O error will be
returned instead.
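
In other words, the read path behaves roughly like the toy model below
(Python; the block/mirror structures and helper names are made up for
illustration, and btrfs actually uses crc32c on disk, not zlib's crc32):

import zlib

class BadBlockError(Exception):
    pass

def read_block(block_nr, mirrors):
    """mirrors: list of (data, stored_crc) pairs, one per on-disk copy."""
    good = None
    for data, stored_crc in mirrors:
        if zlib.crc32(data) == stored_crc:
            good = data
            break
    if good is None:
        # no copy passes its checksum -> the read fails with an I/O error
        raise BadBlockError("block %d: all copies fail their checksum" % block_nr)
    # repair-on-read: any copy whose checksum did not match is rewritten
    # from the good copy
    repaired = [(good, zlib.crc32(good)) if zlib.crc32(d) != c else (d, c)
                for d, c in mirrors]
    return good, repaired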




Phew... that sounds better :)