Re: compress=lzo safe to use?

2016-09-17 Thread Kai Krakow
On Mon, 12 Sep 2016 04:36:07 +0000 (UTC), Duncan <1i5t5.dun...@cox.net>
wrote:

> Again, I once thought all this was just the stage at which btrfs was, 
> until I found out that it doesn't seem to happen if btrfs compression 
> isn't being used.  Something about the way it recovers from checksum 
> errors on compressed data differs from the way it recovers from
> checksum errors on uncompressed data, and there's a bug in the
> compressed data processing path.  But beyond that, I'm not a dev and
> it gets a bit fuzzy, which also explains why I've not gone code
> diving and submitted patches to try to fix it, myself.

I suspect that may very well come from the decompression routine
crashing - and not from btrfs itself. So essentially, the decompression
routine needs to be fixed first (which would probably slow it down
considerably).

Only once that is tested and fixed should one look into why btrfs
fails when decompression fails.

-- 
Regards,
Kai

Replies to list-only preferred.



Re: compress=lzo safe to use?

2016-09-11 Thread Duncan
Hans van Kranenburg posted on Sun, 11 Sep 2016 22:49:58 +0200 as
excerpted:

> So, you can use compression a lot without problems for years.
> 
> It's only when your hardware starts to break in a specific way, causing
> lots and lots of checksum errors, that the kernel might currently be
> unable to handle all of them at the same time.
> 
> The compression code might be super stable itself, but in this case
> another part of the filesystem is not perfectly able to handle certain
> failure scenarios involving it.

Well put.

In my case the problems were triggered by exactly two things, tho there 
are obviously other ways of triggering the same issues, including a crash 
in the middle of a commit, with one copy of the raid1 already updated 
while the other is still being written:

1) I first discovered the problem when one of my pair of ssds was going 
bad.  Because I had btrfs raid1 and could normally scrub-fix things, and 
because I had backups anyway, I chose to continue running it for some 
time, just to see how it handled things, as more and more sectors became 
unwritable and were replaced by spares.  By the end I had several MiB 
worth of spares in use; SMART reported I had only used about 15% of 
the available spares, but by then it was getting bad enough and the 
newness had worn off, so I just replaced it and got rid of the hassle.

But as a result of the above, I had a *LOT* of practice with btrfs 
recovery, mostly running scrub.

And what I found was that if btrfs raid1 encounters too many checksum 
errors in compressed data it will crash btrfs and the kernel, even when 
it *SHOULD* recover from the other device because it has a good copy, as 
demonstrated by the fact that after a reboot, I could run a scrub and fix 
everything, no uncorrected errors at all.

At first I thought it was just the way btrfs worked -- that it could 
handle a few checksum errors but not too many at once.  I had no idea it 
was compression related.  But nobody else seemed to mention the problem, 
which I thought a bit strange, until someone /did/ mention it, and 
furthermore, actually tested both compressed and uncompressed btrfs, and 
found the problem only when btrfs was reading compressed data.  If the 
data wasn't compressed, btrfs went ahead and read the second copy 
correctly, without crashing the system, every time.

The extra kink in this is that at the time, I had a boot-time service 
set up to cache (via cat > /dev/null) a bunch of files in a particular 
directory.  That directory is a cache for news archives, with articles 
in some groups going back over a decade to 2002, and my news client 
(pan) is slow to start up with several gigs of cached messages like 
that.  So I had the boot-time service pre-cache everything; by the time 
I started X and pan, it would be done or nearly so, and I'd not have to 
wait for pan to start up.
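
For illustration, that pre-cache job boils down to something like this, 
run from a simple boot-time unit or script (the path here is made up, 
not my actual layout):

    # read every cached article once so it ends up in the page cache
    find /home/var/cache/news -type f -exec cat {} + > /dev/null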

The problem was that many of the new files were in this directory, and 
all that activity tended to hit the going-bad sectors on that ssd rather 
frequently, making one copy often bad.  Additionally, these are mostly 
text messages, so they compress quite well, meaning compress=lzo would 
trigger compression on many of them.
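
For reference, that's just the ordinary mount option, e.g. an fstab 
line like this (UUID and mountpoint made up):

    UUID=<fs-uuid>  /home  btrfs  defaults,compress=lzo  0 0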

And because I had it reading them at boot, the kernel tended to overload 
on checksum errors before it finished booting, far more frequently than 
it would have otherwise.  Of course, that would crash the system before I 
could get a login in order to run btrfs scrub and fix the problem.

What I had to do then was boot to rescue mode, with the filesystems 
mounted but before normal services (including this caching service) ran, 
run the scrub from there, and then continue boot, which would then work 
just fine because I'd fixed all the checksum errors.
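
Concretely, the recovery sequence was roughly this (my root filesystem 
as the mountpoint here; adjust to taste):

    # from the rescue shell, filesystems already mounted:
    btrfs scrub start -B /    # -B: run in the foreground, wait until done
    btrfs scrub status /      # confirm the errors were all corrected
    systemctl default         # then continue the normal boot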

But, as I said, I eventually got tired of the hassle and just replaced the 
failing device.  Btrfs replace worked nicely. =:^)
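
For anyone who hasn't used it, the replace itself is a single command 
(device names made up):

    btrfs replace start /dev/sdb /dev/sdc /mnt   # failing sdb -> new sdc
    btrfs replace status /mnt                    # watch the copy progress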

2a) My second trigger: I've found that multi-device setups, multi-device 
btrfs now but also mdraid back when I ran it, don't always resume from 
suspend-to-RAM very well.  Often one device takes longer to wake up than 
the other(s), and the kernel will try to resume while one still isn't 
responding properly.  (FWIW, I ran into this problem on spinning rust 
back on mdraid, but I see it now on ssds on btrfs as well, so it seems 
to be a common issue, one that probably remains relatively obscure 
because so few people with multi-device btrfs or mdraid do 
suspend-to-RAM.)

The result is that btrfs will try to write to the remaining device(s), 
getting them out of sync with the one that isn't responding properly 
yet.  Ultimately this leads to a crash if I don't catch it and complete a 
controlled shutdown first, and sometimes I see the same crash-on-boot-
due-to-too-many-checksum-errors problem I saw with #1.  I no longer have 
that caching job running at boot and thus don't see it as often, but it 
still happens occasionally.  Again, once I boot to rescue mode and run a 
scrub, everything is fixed and the normal boot completes fine.
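
A quick way to spot that kind of desync after a suspicious resume, 
before it escalates, is to check the per-device error counters 
(mountpoint made up):

    btrfs device stats /mnt   # non-zero write/flush error counts suggest
                              # a device that dropped out for a while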

Re: compress=lzo safe to use?

2016-09-11 Thread Steven Haigh

On 2016-09-12 05:48, Martin Steigerwald wrote:

On Sunday, 26 June 2016, 13:13:04 CEST, Steven Haigh wrote:

On 26/06/16 12:30, Duncan wrote:
> Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted:
>> In every case, it was a flurry of csum error messages, then instant
>> death.
>
> This is very possibly a known bug in btrfs, that occurs even in raid1
> where a later scrub repairs all csum errors.  While in theory btrfs raid1
> should simply pull from the mirrored copy if its first try fails checksum
> (assuming the second one passes, of course), and it seems to do this just
> fine if there's only an occasional csum error, if it gets too many at
> once, it *does* unfortunately crash, despite the second copy being
> available and being just fine as later demonstrated by the scrub fixing
> the bad copy from the good one.
>
> I'm used to dealing with that here any time I have a bad shutdown (and
> I'm running live-git kde, which currently has a bug that triggers a
> system crash if I let it idle and shut off the monitors, so I've been
> getting crash shutdowns and having to deal with this unfortunately often,
> recently).  Fortunately I keep my root, with all system executables, etc,
> mounted read-only by default, so it's not affected and I can /almost/
> boot normally after such a crash.  The problem is /var/log and /home
> (which has some parts of /var that need to be writable symlinked into
> /home/var, so / can stay read-only).  Something in the normal after-crash
> boot triggers enough csum errors there that I often crash again.
>
> So I have to boot to emergency mode and manually mount the filesystems in
> question, so nothing's trying to access them until I run the scrub and
> fix the csum errors.  Scrub itself doesn't trigger the crash, thankfully,
> and once it has repaired all the csum errors due to partial writes on one
> mirror that either were never made or were properly completed on the
> other mirror, I can exit emergency mode and complete the normal boot (to
> the multi-user default target).  As there are no more csum errors then,
> because scrub fixed them all, the boot doesn't crash due to too many such
> errors, and I'm back in business.
>
>
> Tho I believe at least the csum bug that affects me may only trigger if
> compression is (or perhaps has been in the past) enabled.  Since I run
> compress=lzo everywhere, that would certainly affect me.  It would also
> explain why the bug has remained around for quite some time as well,
> since presumably the devs don't run with compression on enough for this
> to have become a personal itch they needed to scratch, thus its remaining
> untraced and unfixed.
>
> So if you weren't using the compress option, your bug is probably
> different, but either way, the whole thing about too many csum errors at
> once triggering a system crash sure does sound familiar, here.

Yes, I was running the compress=lzo option as well... Maybe here lies a 
common problem?


Hmm… I found this thread via a reference on the Debian wiki page on 
BTRFS¹.

I have been using compress=lzo on BTRFS RAID 1 since April 2014 and have 
never found an issue. Steven, your filesystem wasn't RAID 1 but RAID 5 
or 6?


Yes, I was using RAID6 - and it has had a track record of eating data. 
There are lots of problems with the implementation / correctness of the 
RAID5/6 parity code - which I'm pretty sure haven't been nailed down yet. 
The recommendation at the moment is simply not to use the RAID5 or RAID6 
modes of BTRFS. The last I heard, if you were using RAID5/6 in BTRFS, the 
recommended action was to migrate your data to a different profile or a 
different FS.
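
If anyone needs it, the profile migration can be done in place with 
balance filters, assuming enough devices and free space for the target 
profile (mountpoint made up):

    btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt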


I just want to assess whether using compress=lzo might be dangerous to 
use in my setup. Actually right now I'd like to keep using it, since I 
think at least one of the SSDs does not compress. And… well… /home and / 
where I use it are both quite full already.


I don't believe the compress=lzo option by itself was a problem - but it 
*may* have an impact on the RAID5/6 parity problems? I'd be guessing 
here, but am happy to be corrected.


--
Steven Haigh

Email: net...@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897


Re: compress=lzo safe to use?

2016-09-11 Thread Hans van Kranenburg
On 09/11/2016 09:48 PM, Martin Steigerwald wrote:
> On Sunday, 26 June 2016, 13:13:04 CEST, Steven Haigh wrote:
>> On 26/06/16 12:30, Duncan wrote:
>>> Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted:
 In every case, it was a flurry of csum error messages, then instant
 death.
>>>
>>> This is very possibly a known bug in btrfs, that occurs even in raid1
>>> where a later scrub repairs all csum errors.  While in theory btrfs raid1
>>> should simply pull from the mirrored copy if its first try fails checksum
>>> (assuming the second one passes, of course), and it seems to do this just
>>> fine if there's only an occasional csum error, if it gets too many at
>>> once, it *does* unfortunately crash [...]

[...]

>>> different, but either way, the whole thing about too many csum errors at
>>> once triggering a system crash sure does sound familiar, here.
>>
>> Yes, I was running the compress=lzo option as well... Maybe here lies a
>> common problem?
> 
> Hmm… I found this thread via a reference on the Debian wiki page on 
> BTRFS¹.
> 
> I have been using compress=lzo on BTRFS RAID 1 since April 2014 and have 
> never found an issue. Steven, your filesystem wasn't RAID 1 but RAID 5 or 6?

To quote you from the "stability a joke" thread (which I guess this
might be related to)... "For me so far even compress=lzo seems to be
stable, but well for others it may not."

So, you can use compression a lot without problems for years.

It's only when your hardware starts to break in a specific way, causing
lots and lots of checksum errors, that the kernel might currently be
unable to handle all of them at the same time.

The compression code might be super stable itself, but in this case
another part of the filesystem is not perfectly able to handle certain
failure scenarios involving it.

Another way to find out about "are there issues with compression" is
to look in the kernel git history.
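
For example, something like this inside a kernel git checkout turns
them up (the exact flags are just one way to do it):

    git log --oneline --all-match -i --grep=compress --grep=corrupt -- fs/btrfs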

When searching for "compression" and "corruption", you'll find fixes
like these:

commit 0305cd5f7fca85dae392b9ba85b116896eb7c1c7
Author: Filipe Manana 
Date:   Fri Oct 16 12:34:25 2015 +0100

Btrfs: fix truncation of compressed and inlined extents

commit 808f80b46790f27e145c72112189d6a3be2bc884
Author: Filipe Manana 
Date:   Mon Sep 28 09:56:26 2015 +0100

Btrfs: update fix for read corruption of compressed and shared extents

commit 005efedf2c7d0a270ffbe28d8997b03844f3e3e7
Author: Filipe Manana 
Date:   Mon Sep 14 09:09:31 2015 +0100

Btrfs: fix read corruption of compressed and shared extents

commit 619d8c4ef7c5dd346add55da82c9179cd2e3387e
Author: Filipe Manana 
Date:   Sun May 3 01:56:00 2015 +0100

Btrfs: incremental send, fix clone operations for compressed extents

These commits fix actual data corruption issues. Still, these might be
bugs that you've never hit, even when using a kernel containing them for
years, because they require a certain "nasty sequence of events" to trigger.

But, when using compression you certainly want to have these commits in
the kernel you're running right now. And where the bugs already caused
corruption, using a fixed kernel will not retroactively fix the corrupt
data.

Hint: "this was fixed in 4.x.y, so run that version or later" is not
always the only answer here, because you'll see that fixes like these
even show up in kernels like 3.16.y
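
If you want to check which releases actually contain one of these
fixes, you can ask git directly, e.g. in a stable kernel checkout,
using a commit id from the list above:

    git tag --contains 005efedf2c7d0a270ffbe28d8997b03844f3e3e7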

But maybe I should continue by replying on the joke thread instead of
typing more here.

-- 
Hans van Kranenburg


Re: compress=lzo safe to use? (was: Re: Trying to rescue my data :()

2016-09-11 Thread Chris Murphy
On Sun, Sep 11, 2016 at 2:06 PM, Adam Borowski  wrote:
> On Sun, Sep 11, 2016 at 09:48:35PM +0200, Martin Steigerwald wrote:
>> Hmm… I found this thread via a reference on the Debian wiki page on
>> BTRFS¹.
>>
>> I have been using compress=lzo on BTRFS RAID 1 since April 2014 and have
>> never found an issue. Steven, your filesystem wasn't RAID 1 but RAID 5 or 6?
>>
>> I just want to assess whether using compress=lzo might be dangerous to
>> use in my setup. Actually right now I'd like to keep using it, since I
>> think at least one of the SSDs does not compress. And… well… /home and /
>> where I use it are both quite full already.
>>
>> [1] https://wiki.debian.org/Btrfs#WARNINGS
>
> I have used compress=lzo for years, kernels 3.8, 3.13 and 3.14 (a bunch of
> machines), without a single glitch; heavy snapshotting, single dev only, no
> quota.  Until recently I never balanced.
>
> I did have a case of ENOSPC with <80% full on 4.7 which might or might not
> be related to compress=lzo.

I'm not finding it offhand, but Duncan has some experience with this
issue, where he'd occasionally have some sort of problem (hand wave).
I don't know how serious it was, maybe just scary warnings like a call
trace or something, but no actual damage? My recollection is that
compression might be making certain edge-case problems more difficult
to recover from. I don't know why that would be, as metadata itself
isn't compressed (though the inline data saved in metadata nodes can be
compressed). But there you go: if things start going wonky, compression
might make recovery more difficult. But that's speculative. And I also
don't know if there's any difference between lzo and zlib in this
regard either.


-- 
Chris Murphy


Re: compress=lzo safe to use? (was: Re: Trying to rescue my data :()

2016-09-11 Thread Adam Borowski
On Sun, Sep 11, 2016 at 09:48:35PM +0200, Martin Steigerwald wrote:
> Hmm… I found this thread via a reference on the Debian wiki page on 
> BTRFS¹.
> 
> I have been using compress=lzo on BTRFS RAID 1 since April 2014 and have 
> never found an issue. Steven, your filesystem wasn't RAID 1 but RAID 5 or 6?
> 
> I just want to assess whether using compress=lzo might be dangerous to 
> use in my setup. Actually right now I'd like to keep using it, since I 
> think at least one of the SSDs does not compress. And… well… /home and / 
> where I use it are both quite full already.
> 
> [1] https://wiki.debian.org/Btrfs#WARNINGS

I have used compress=lzo for years, kernels 3.8, 3.13 and 3.14 (a bunch of
machines), without a single glitch; heavy snapshotting, single dev only, no
quota.  Until recently I never balanced.

I did have a case of ENOSPC with <80% full on 4.7 which might or might not
be related to compress=lzo.
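
In case it helps anyone hitting the same thing: comparing chunk
allocation against actual usage, and compacting mostly-empty data
chunks, is the usual first aid (mountpoint made up):

    btrfs filesystem usage /mnt            # allocated vs. actually used
    btrfs balance start -dusage=50 /mnt    # rewrite data chunks <50% used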

-- 
Second "wet cat laying down on a powered-on box-less SoC on the desk" close
shave in a week.  Protect your ARMs, folks!