Re: compress=lzo safe to use?

Hans van Kranenburg Sun, 11 Sep 2016 13:50:31 -0700

On 09/11/2016 09:48 PM, Martin Steigerwald wrote:
> On Sunday, 26 June 2016, 13:13:04 CEST, Steven Haigh wrote:
>> On 26/06/16 12:30, Duncan wrote:
>>> Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted:
>>>> In every case, it was a flurry of csum error messages, then instant
>>>> death.
>>>
>>> This is very possibly a known bug in btrfs, that occurs even in raid1
>>> where a later scrub repairs all csum errors.  While in theory btrfs raid1
>>> should simply pull from the mirrored copy if its first try fails checksum
>>> (assuming the second one passes, of course), and it seems to do this just
>>> fine if there's only an occasional csum error, if it gets too many at
>>> once, it *does* unfortunately crash [...]


[...]

>>> different, but either way, the whole thing about too many csum errors at
>>> once triggering a system crash sure does sound familiar, here.
>>
>> Yes, I was running the compress=lzo option as well... Maybe here lays a
>> common problem?
> 
> Hmm… I found this from being referred to by reading Debian wiki page on 
> BTRFS¹.
> 
> I use compress=lzo on BTRFS RAID 1 since April 2014 and I never found an 
> issue. Steven, your filesystem wasn´t RAID 1 but RAID 5 or 6?

To quote you from the "stability a joke" thread (which I guess this
might be related to)... "For me so far even compress=lzo seems to be
stable, but well for others it may not."

So you can use compress a lot for years without running into problems.

Only when your hardware starts to break in a specific way, causing lots
and lots of checksum errors, might the kernel currently be unable to
handle all of them at the same time.

The compress code itself might be super stable, but in this case another
part of the filesystem is not perfectly able to handle certain failure
scenarios involving it.

Another way to find out whether there are issues with compression is to
look in the kernel git history.

When searching for "compression" and "corruption", you'll find fixes
like these:

commit 0305cd5f7fca85dae392b9ba85b116896eb7c1c7
Author: Filipe Manana <fdman...@suse.com>
Date:   Fri Oct 16 12:34:25 2015 +0100

    Btrfs: fix truncation of compressed and inlined extents

commit 808f80b46790f27e145c72112189d6a3be2bc884
Author: Filipe Manana <fdman...@suse.com>
Date:   Mon Sep 28 09:56:26 2015 +0100

    Btrfs: update fix for read corruption of compressed and shared extents

commit 005efedf2c7d0a270ffbe28d8997b03844f3e3e7
Author: Filipe Manana <fdman...@suse.com>
Date:   Mon Sep 14 09:09:31 2015 +0100

    Btrfs: fix read corruption of compressed and shared extents

commit 619d8c4ef7c5dd346add55da82c9179cd2e3387e
Author: Filipe Manana <fdman...@suse.com>
Date:   Sun May 3 01:56:00 2015 +0100

    Btrfs: incremental send, fix clone operations for compressed extents

These commits fix actual data corruption issues. Still, these might be
bugs that you've never seen, even after running a kernel that has them
for years, because they require a certain "nasty sequence of events" to
trigger.
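
For what it's worth, the history search mentioned above could be scripted
roughly like this. It is only a sketch: it assumes a local clone of the
kernel tree and limits the search to fs/btrfs (my assumption). The
function names are made up for illustration; only the git options
themselves (--grep, --all-match, --regexp-ignore-case, --format, --date)
are real.

```python
import subprocess


def build_log_command(terms, path="fs/btrfs"):
    """Build a `git log` command for commits whose message matches every term."""
    cmd = [
        "git", "log",
        "--regexp-ignore-case",      # match "Corruption" as well as "corruption"
        "--all-match",               # require all --grep patterns, not just one
        "--date=short",
        "--format=%H%x09%ad%x09%s",  # hash, author date, subject, tab-separated
    ]
    for term in terms:
        cmd.append("--grep=" + term)
    # Assumption: only look at commits touching the btrfs directory.
    cmd += ["--", path]
    return cmd


def parse_log_output(text):
    """Turn the tab-separated output into (hash, date, subject) tuples."""
    commits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        commit_hash, date, subject = line.split("\t", 2)
        commits.append((commit_hash, date, subject))
    return commits


def search_history(repo_dir, terms):
    """Run the search inside a local clone and return the parsed commits."""
    result = subprocess.run(
        build_log_command(terms),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_log_output(result.stdout)
```

Calling search_history("/path/to/linux", ["compress", "corrupt"]) would
then list candidate commits to read through. The search terms are a
judgment call, and a match only means the words appear in the commit
message, not that the commit is a corruption fix.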

But, when using compression you certainly want to have these commits in
the kernel you're running right now. And when the bugs have already
caused corruption, using a fixed kernel will not retroactively fix the
corrupt data.

Hint: "this was fixed in 4.x.y, so run that version or later" is not
always the only answer here, because you'll see that fixes like these
even show up in kernels like 3.16.y.
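
To check one particular kernel for one of these fixes, something along
these lines might work. It is a rough sketch, assuming a local clone that
has the tag or branch for your kernel. A mainline release contains the
fix if the commit is an ancestor of that ref. A stable branch normally
carries a backport under a different hash, with the upstream commit ID
quoted in the backport's message, so the fallback greps for that ID. The
function names are hypothetical; git merge-base --is-ancestor and
git log --grep are the real commands.

```python
import subprocess


def ancestor_command(commit, ref):
    """Command that exits 0 when `commit` is reachable from `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def backport_command(commit, ref):
    """Command listing commits on `ref` whose message mentions `commit`."""
    return ["git", "log", "--format=%H", "--grep=" + commit, ref]


def contains_fix(repo_dir, commit, ref):
    """True if `ref` has the commit itself or a backport that refers to it."""
    direct = subprocess.run(ancestor_command(commit, ref), cwd=repo_dir)
    if direct.returncode == 0:
        return True
    if direct.returncode != 1:
        # Anything other than 0 or 1 means git could not answer the question,
        # for example because the commit or ref is unknown in this clone.
        raise RuntimeError("git merge-base failed for %s on %s" % (commit, ref))
    # Assumption: the backport quotes the full upstream commit ID.
    backports = subprocess.run(
        backport_command(commit, ref),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return bool(backports.stdout.strip())
```

Here `ref` is whatever tag or branch matches the kernel you run. A
distribution kernel with its own patches would need the distribution's
tree or changelog instead, which this sketch does not cover.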

But maybe I should continue by replying on the joke thread instead of
typing more here.

-- 
Hans van Kranenburg