Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 9:07 PM, Larkin Lowrey
 wrote:
> On 10/10/2018 10:51 PM, Chris Murphy wrote:
>>
>> On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey
>>  wrote:
>>>
>>> On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:

 On 10/10/2018 07:44 PM, Chris Murphy wrote:
>
>
> I'm pretty sure you have to umount, and then clear the space_cache
> with 'btrfs check --clear-space-cache=v1' and then do a one time mount
> with -o space_cache=v2.

 The --clear-space-cache=v1 is optional, but recommended, if you are
 someone who does not like to keep accumulated cruft.

 The v2 mount (rw mount!!!) does not remove the v1 cache. If you just
 mount with v2, the v1 data stays there, doing nothing any more.
>>>
>>>
>>> Theoretically I have the v2 space_cache enabled. After a clean umount...
>>>
>>> # mount -onospace_cache /backups
>>> [  391.243175] BTRFS info (device dm-3): disabling free space tree
>>> [  391.249213] BTRFS error (device dm-3): cannot disable free space tree
>>> [  391.255884] BTRFS error (device dm-3): open_ctree failed
>>
>> "free space tree" is the v2 space cache, and once enabled it cannot be
>> disabled with nospace_cache mount option. If you want to run with
>> nospace_cache you'll need to clear it.
>>
>>
>>> # mount -ospace_cache=v1 /backups/
>>> mount: /backups: wrong fs type, bad option, bad superblock on
>>> /dev/mapper/Cached-Backups, missing codepage or helper program, or other
>>> error
>>> [  983.501874] BTRFS info (device dm-3): enabling disk space caching
>>> [  983.508052] BTRFS error (device dm-3): cannot disable free space tree
>>> [  983.514633] BTRFS error (device dm-3): open_ctree failed
>>
>> You cannot go back and forth between v1 and v2. Once v2 is enabled,
>> it's always used regardless of any mount option. You'll need to use
>> btrfs check to clear the v2 cache if you want to use v1 cache.
>>
>>
>>> # btrfs check --clear-space-cache v1 /dev/Cached/Backups
>>> Opening filesystem to check...
>>> couldn't open RDWR because of unsupported option features (3).
>>> ERROR: cannot open file system
>>
>> You're missing the '=' symbol for the clear option, that's why it fails.
>>
>
> # btrfs check --clear-space-cache=v2 /dev/Cached/Backups
> Opening filesystem to check...
> Checking filesystem on /dev/Cached/Backups
> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
> Clear free space cache v2
> Segmentation fault (core dumped)
>
> [  109.686188] btrfs[2429]: segfault at 68 ip 555ff6394b1c sp
> 7ffcc4733ab0 error 4 in btrfs[555ff637c000+ca000]
> [  109.696732] Code: ff e8 68 ed ff ff 8b 4c 24 58 4d 8b 8f c7 01 00 00 4c
> 89 fe 85 c0 0f 44 44 24 40 45 31 c0 89 44 24 40 48 8b 84 24 90 00 00 00 <8b>
> 40 68 49 29 87 d0 00 00 00 6a 00 55 48 8b 54 24 18 48 8b 7c 24
>
> That's btrfs-progs v4.17.1 on 4.18.12-200.fc28.x86_64.
>
> I appreciate the help and advice from everyone who has contributed to this
> thread. At this point, unless there is something for the project to gain
> from tracking down this trouble, I'm just going to nuke the fs and start
> over.

Is this the 68T file system? Seems excessive. For now you should be able
to use the new v2 space tree. I think Qu or some dev will want to know
why you're getting a crash trying to clear the v2 space cache. Maybe
try clearing the v1 first, then v2? While v1 is the default right now,
the plan is to go to v2 by default soonish, but the inability to clear
it is a bug worth investigating. I've just tried it on several of my
file systems and it clears without error and rebuilds at the next mount
with the v2 option.

If it is the 68T file system, I don't expect a btrfs-image is going to
be easy to capture or deliver: you've got 95GiB of metadata!
Compressed that's still a ~30-45GiB image.
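
If you do decide to capture one anyway, a rough sketch (compression
level, thread count, and output path are just illustrative; the fs must
be unmounted):

# umount /backups
# btrfs-image -c9 -t4 /dev/mapper/Cached-Backups /mnt/scratch/backups.img

btrfs-image copies only metadata, so the image size scales with the
95GiB of metadata, not the 68T of data.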


-- 
Chris Murphy


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey

On 10/10/2018 10:51 PM, Chris Murphy wrote:

On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey
 wrote:

On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:

On 10/10/2018 07:44 PM, Chris Murphy wrote:


I'm pretty sure you have to umount, and then clear the space_cache
with 'btrfs check --clear-space-cache=v1' and then do a one time mount
with -o space_cache=v2.

The --clear-space-cache=v1 is optional, but recommended, if you are
someone who does not like to keep accumulated cruft.

The v2 mount (rw mount!!!) does not remove the v1 cache. If you just
mount with v2, the v1 data stays there, doing nothing any more.


Theoretically I have the v2 space_cache enabled. After a clean umount...

# mount -onospace_cache /backups
[  391.243175] BTRFS info (device dm-3): disabling free space tree
[  391.249213] BTRFS error (device dm-3): cannot disable free space tree
[  391.255884] BTRFS error (device dm-3): open_ctree failed

"free space tree" is the v2 space cache, and once enabled it cannot be
disabled with nospace_cache mount option. If you want to run with
nospace_cache you'll need to clear it.



# mount -ospace_cache=v1 /backups/
mount: /backups: wrong fs type, bad option, bad superblock on
/dev/mapper/Cached-Backups, missing codepage or helper program, or other
error
[  983.501874] BTRFS info (device dm-3): enabling disk space caching
[  983.508052] BTRFS error (device dm-3): cannot disable free space tree
[  983.514633] BTRFS error (device dm-3): open_ctree failed

You cannot go back and forth between v1 and v2. Once v2 is enabled,
it's always used regardless of any mount option. You'll need to use
btrfs check to clear the v2 cache if you want to use v1 cache.



# btrfs check --clear-space-cache v1 /dev/Cached/Backups
Opening filesystem to check...
couldn't open RDWR because of unsupported option features (3).
ERROR: cannot open file system

You're missing the '=' symbol for the clear option, that's why it fails.



# btrfs check --clear-space-cache=v2 /dev/Cached/Backups
Opening filesystem to check...
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Clear free space cache v2
Segmentation fault (core dumped)

[  109.686188] btrfs[2429]: segfault at 68 ip 555ff6394b1c sp 
7ffcc4733ab0 error 4 in btrfs[555ff637c000+ca000]
[  109.696732] Code: ff e8 68 ed ff ff 8b 4c 24 58 4d 8b 8f c7 01 00 00 
4c 89 fe 85 c0 0f 44 44 24 40 45 31 c0 89 44 24 40 48 8b 84 24 90 00 00 
00 <8b> 40 68 49 29 87 d0 00 00 00 6a 00 55 48 8b 54 24 18 48 8b 7c 24


That's btrfs-progs v4.17.1 on 4.18.12-200.fc28.x86_64.

I appreciate the help and advice from everyone who has contributed to 
this thread. At this point, unless there is something for the project to 
gain from tracking down this trouble, I'm just going to nuke the fs and 
start over.


--Larkin



Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey
 wrote:
> On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:
>>
>> On 10/10/2018 07:44 PM, Chris Murphy wrote:
>>>
>>>
>>> I'm pretty sure you have to umount, and then clear the space_cache
>>> with 'btrfs check --clear-space-cache=v1' and then do a one time mount
>>> with -o space_cache=v2.
>>
>> The --clear-space-cache=v1 is optional, but recommended, if you are
>> someone who does not like to keep accumulated cruft.
>>
>> The v2 mount (rw mount!!!) does not remove the v1 cache. If you just
>> mount with v2, the v1 data stays there, doing nothing any more.
>
>
> Theoretically I have the v2 space_cache enabled. After a clean umount...
>
> # mount -onospace_cache /backups
> [  391.243175] BTRFS info (device dm-3): disabling free space tree
> [  391.249213] BTRFS error (device dm-3): cannot disable free space tree
> [  391.255884] BTRFS error (device dm-3): open_ctree failed

"free space tree" is the v2 space cache, and once enabled it cannot be
disabled with nospace_cache mount option. If you want to run with
nospace_cache you'll need to clear it.


>
> # mount -ospace_cache=v1 /backups/
> mount: /backups: wrong fs type, bad option, bad superblock on
> /dev/mapper/Cached-Backups, missing codepage or helper program, or other
> error
> [  983.501874] BTRFS info (device dm-3): enabling disk space caching
> [  983.508052] BTRFS error (device dm-3): cannot disable free space tree
> [  983.514633] BTRFS error (device dm-3): open_ctree failed

You cannot go back and forth between v1 and v2. Once v2 is enabled,
it's always used regardless of any mount option. You'll need to use
btrfs check to clear the v2 cache if you want to use v1 cache.
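
A sketch of that, with the filesystem unmounted (device and mount point
taken from your output above):

# umount /backups
# btrfs check --clear-space-cache=v2 /dev/mapper/Cached-Backups
# mount -o space_cache=v1 /backups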


>
> # btrfs check --clear-space-cache v1 /dev/Cached/Backups
> Opening filesystem to check...
> couldn't open RDWR because of unsupported option features (3).
> ERROR: cannot open file system

You're missing the '=' symbol for the clear option, that's why it fails.




-- 
Chris Murphy


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey

On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:

On 10/10/2018 07:44 PM, Chris Murphy wrote:


I'm pretty sure you have to umount, and then clear the space_cache
with 'btrfs check --clear-space-cache=v1' and then do a one time mount
with -o space_cache=v2.

The --clear-space-cache=v1 is optional, but recommended, if you are
someone who does not like to keep accumulated cruft.

The v2 mount (rw mount!!!) does not remove the v1 cache. If you just
mount with v2, the v1 data stays there, doing nothing any more.


Theoretically I have the v2 space_cache enabled. After a clean umount...

# mount -onospace_cache /backups
[  391.243175] BTRFS info (device dm-3): disabling free space tree
[  391.249213] BTRFS error (device dm-3): cannot disable free space tree
[  391.255884] BTRFS error (device dm-3): open_ctree failed

# mount -ospace_cache=v1 /backups/
mount: /backups: wrong fs type, bad option, bad superblock on 
/dev/mapper/Cached-Backups, missing codepage or helper program, or other 
error

[  983.501874] BTRFS info (device dm-3): enabling disk space caching
[  983.508052] BTRFS error (device dm-3): cannot disable free space tree
[  983.514633] BTRFS error (device dm-3): open_ctree failed

# btrfs check --clear-space-cache v1 /dev/Cached/Backups
Opening filesystem to check...
couldn't open RDWR because of unsupported option features (3).
ERROR: cannot open file system

# btrfs --version
btrfs-progs v4.17.1

# mount /backups/
[ 1036.840637] BTRFS info (device dm-3): using free space tree
[ 1036.846272] BTRFS info (device dm-3): has skinny extents
[ 1036.999456] BTRFS info (device dm-3): bdev /dev/mapper/Cached-Backups 
errs: wr 0, rd 0, flush 0, corrupt 666, gen 25

[ 1043.025076] BTRFS info (device dm-3): enabling ssd optimizations

Backups will run tonight and will beat on the FS. Perhaps if something 
interesting happens I'll have more log data.


--Larkin


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Hans van Kranenburg
On 10/10/2018 07:44 PM, Chris Murphy wrote:
> On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte
>  wrote:
>> On 10/10/18 17:44, Larkin Lowrey wrote:
>> (..)
>>>
>>> About once a week, or so, I'm running into the above situation where
>>> FS seems to deadlock. All IO to the FS blocks, there is no IO
>>> activity at all. I have to hard reboot the system to recover. There
>>> are no error indications except for the following which occurs well
>>> before the FS freezes up:
>>>
>>> BTRFS warning (device dm-3): block group 78691883286528 has wrong amount
>>> of free space
>>> BTRFS warning (device dm-3): failed to load free space cache for block
>>> group 78691883286528, rebuilding it now
>>>
>>> Do I have any options other than nuking the FS and starting over?
>>
>>
>> Unmount cleanly & mount again with -o space_cache=v2.
> 
> I'm pretty sure you have to umount, and then clear the space_cache
> with 'btrfs check --clear-space-cache=v1' and then do a one time mount
> with -o space_cache=v2.

The --clear-space-cache=v1 is optional, but recommended, if you are
someone who does not like to keep accumulated cruft.

The v2 mount (rw mount!!!) does not remove the v1 cache. If you just
mount with v2, the v1 data stays there, doing nothing any more.

> But anyway, to me that seems premature because we don't even know
> what's causing the problem.
> 
> a. Freezing means there's a kernel bug. Hands down.
> b. Is it freezing on the rebuild? Or something else?
> c. I think the devs would like to see the output from btrfs-progs
> v4.17.1, 'btrfs check --mode=lowmem' and see if it finds anything, in
> particular something not related to free space cache.
> 
> Rebuilding either version of space cache requires successfully reading
> (and parsing) the extent tree.
> 
> 


-- 
Hans van Kranenburg


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Qu Wenruo


On 2018/10/11 1:25 AM, Larkin Lowrey wrote:
> On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:
>> On 10/10/18 17:44, Larkin Lowrey wrote:
>> (..)
>>> About once a week, or so, I'm running into the above situation where
>>> FS seems to deadlock. All IO to the FS blocks, there is no IO
>>> activity at all. I have to hard reboot the system to recover. There
>>> are no error indications except for the following which occurs well
>>> before the FS freezes up:
>>>
>>> BTRFS warning (device dm-3): block group 78691883286528 has wrong
>>> amount of free space
>>> BTRFS warning (device dm-3): failed to load free space cache for
>>> block group 78691883286528, rebuilding it now
>>>
> Do I have any options other than nuking the FS and starting over?
>>
>> Unmount cleanly & mount again with -o space_cache=v2.
> 
> It froze while unmounting. The attached zip is a stack dump captured via
> 'echo t > /proc/sysrq-trigger'. A second attempt after a hard reboot
> worked.

The trace shows it's indeed the free space cache writeback code causing
the problem.

It may be a deadlock caused by nested tree locks taken by the extent
allocator and the free space writeback code.

To avoid such problems, you could completely disable the v1 free space
cache or switch to the v2 cache.

Chris Murphy's guide should be pretty good.


Personally speaking, if your usage is not a performance-critical case,
the following things can be disabled to avoid possible bugs:

1) free space cache
   It only increases the speed of free space lookups.
2) tree log
   It only speeds up fsync() calls. Without it we just fall back to
   sync().

So I'd recommend the following mount option:
nospace_cache,notreelog
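
A sketch of the full sequence (the free space tree has to be cleared
before nospace_cache can take effect; device and mount point are from
your earlier output, the fstab line is only illustrative):

# umount /backups
# btrfs check --clear-space-cache=v2 /dev/Cached/Backups
# mount -o nospace_cache,notreelog /backups

or persistently in /etc/fstab:

/dev/mapper/Cached-Backups  /backups  btrfs  nospace_cache,notreelog  0 0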

Thanks,
Qu

> 
> --Larkin





Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 12:31 PM, Larkin Lowrey
 wrote:

> Interesting, because I do not see any indications of any other errors. The
> fs is backed by an mdraid array and the raid checks always pass with no
> mismatches, edac-util doesn't report any ECC errors, smartd doesn't report
> any SMART errors, and I never see any raid controller errors. I have the
> console connected through serial to a logging console server so if there
> were errors reported I would have seen them.

I think Holger is referring to the multiple reports like this:

[  817.883261] scsi_eh_0   S0   141  2 0x8000
[  817.66] Call Trace:
[  817.891391]  ? __schedule+0x253/0x860
[  817.895094]  ? scsi_try_target_reset+0x90/0x90
[  817.899631]  ? scsi_eh_get_sense+0x220/0x220
[  817.904045]  schedule+0x28/0x80
[  817.907260]  scsi_error_handler+0x1d2/0x5b0
[  817.911514]  ? __schedule+0x25b/0x860
[  817.915207]  ? scsi_eh_get_sense+0x220/0x220
[  817.919547]  kthread+0x112/0x130
[  817.922818]  ? kthread_create_worker_on_cpu+0x70/0x70
[  817.928015]  ret_from_fork+0x22/0x40


That isn't a SCSI controller or drive error itself; it's a capture of
a thread that's in the state of handling scsi errors (maybe).

I'm finding scsi_try_target_reset here at line 855
https://github.com/torvalds/linux/blob/master/drivers/scsi/scsi_error.c

And also line 2143 for scsi_error_handler
https://github.com/torvalds/linux/blob/master/drivers/scsi/scsi_error.c

Is the problem Btrfs on sysroot? Because if the sysroot file system is
entirely error free, I'd expect to eventually get a lot more error
information from the kernel even without sysrq+t rather than
faceplanting. Can you post the entire dmesg? The posted one starts at
~815 seconds, and the problems definitely start before then but as it
is we have nothing really to go on.
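
If persistent journald logging is enabled (it usually is on Fedora), the
previous boot's kernel messages can be pulled with something like:

# journalctl -k -b -1 > dmesg-prev-boot.txt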


-- 
Chris Murphy


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey

On 10/10/2018 2:20 PM, Holger Hoffstätte wrote:

On 10/10/18 19:25, Larkin Lowrey wrote:

On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:

On 10/10/18 17:44, Larkin Lowrey wrote:
(..)

About once a week, or so, I'm running into the above situation where
FS seems to deadlock. All IO to the FS blocks, there is no IO
activity at all. I have to hard reboot the system to recover. There
are no error indications except for the following which occurs well
before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong 
amount of free space
BTRFS warning (device dm-3): failed to load free space cache for 
block group 78691883286528, rebuilding it now


Do I have any options other than nuking the FS and starting over?


Unmount cleanly & mount again with -o space_cache=v2.


It froze while unmounting. The attached zip is a stack dump captured
via 'echo t > /proc/sysrq-trigger'. A second attempt after a hard
reboot worked.


Trace says freespace cache writeout failed midway while the scsi device
was resetting itself and then went rrrghh. Probably managed to hit
different blocks on the second attempt. So chances are your controller,
disk or something else is broken, dying, or both.
When things have settled and you have verified that r/o mounting works
and is stable, try rescuing the data (when necessary) before scrubbing,
dm-device-checking or whatever you have set up.


Interesting, because I do not see any indications of any other errors. 
The fs is backed by an mdraid array and the raid checks always pass with 
no mismatches, edac-util doesn't report any ECC errors, smartd doesn't 
report any SMART errors, and I never see any raid controller errors. I 
have the console connected through serial to a logging console server so 
if there were errors reported I would have seen them.


--Larkin


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Holger Hoffstätte

On 10/10/18 19:44, Chris Murphy wrote:

On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte
 wrote:

On 10/10/18 17:44, Larkin Lowrey wrote:
(..)


About once a week, or so, I'm running into the above situation where
FS seems to deadlock. All IO to the FS blocks, there is no IO
activity at all. I have to hard reboot the system to recover. There
are no error indications except for the following which occurs well
before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount
of free space
BTRFS warning (device dm-3): failed to load free space cache for block
group 78691883286528, rebuilding it now

Do I have any options other than nuking the FS and starting over?



Unmount cleanly & mount again with -o space_cache=v2.


I'm pretty sure you have to umount, and then clear the space_cache
with 'btrfs check --clear-space-cache=v1' and then do a one time mount
with -o space_cache=v2.

But anyway, to me that seems premature because we don't even know
what's causing the problem.


Space cache writeout not honoring errors from the depths below is not
unusual; I think there were some fixes recently which Larkin likely
doesn't have yet. But yeah, I forgot to mention that cache-v2 alone
won't really fix the _underlying_ symptoms. It is, however, vastly more
reliable in general.


a. Freezing means there's a kernel bug. Hands down.
b. Is it freezing on the rebuild? Or something else?
c. I think the devs would like to see the output from btrfs-progs
v4.17.1, 'btrfs check --mode=lowmem' and see if it finds anything, in
particular something not related to free space cache.


Apart from performance implications, if only the free space cache
inodes/blocks are borked then the rest will (should) work just fine
and/or be replaced/overwritten eventually.

Well, at least that was the idea. :}

-h


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Holger Hoffstätte

On 10/10/18 19:25, Larkin Lowrey wrote:

On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:

On 10/10/18 17:44, Larkin Lowrey wrote:
(..)

About once a week, or so, I'm running into the above situation where
FS seems to deadlock. All IO to the FS blocks, there is no IO
activity at all. I have to hard reboot the system to recover. There
are no error indications except for the following which occurs well
before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of 
free space
BTRFS warning (device dm-3): failed to load free space cache for block group 
78691883286528, rebuilding it now

Do I have any options other than nuking the FS and starting over?


Unmount cleanly & mount again with -o space_cache=v2.


It froze while unmounting. The attached zip is a stack dump captured
via 'echo t > /proc/sysrq-trigger'. A second attempt after a hard
reboot worked.


Trace says freespace cache writeout failed midway while the scsi device
was resetting itself and then went rrrghh. Probably managed to hit
different blocks on the second attempt. So chances are your controller,
disk or something else is broken, dying, or both.
When things have settled and you have verified that r/o mounting works
and is stable, try rescuing the data (when necessary) before scrubbing,
dm-device-checking or whatever you have set up.

-h


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte
 wrote:
> On 10/10/18 17:44, Larkin Lowrey wrote:
> (..)
>>
>> About once a week, or so, I'm running into the above situation where
>> FS seems to deadlock. All IO to the FS blocks, there is no IO
>> activity at all. I have to hard reboot the system to recover. There
>> are no error indications except for the following which occurs well
>> before the FS freezes up:
>>
>> BTRFS warning (device dm-3): block group 78691883286528 has wrong amount
>> of free space
>> BTRFS warning (device dm-3): failed to load free space cache for block
>> group 78691883286528, rebuilding it now
>>
>> Do I have any options other than nuking the FS and starting over?
>
>
> Unmount cleanly & mount again with -o space_cache=v2.

I'm pretty sure you have to umount, and then clear the space_cache
with 'btrfs check --clear-space-cache=v1' and then do a one time mount
with -o space_cache=v2.
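
Concretely, something like this (a sketch, using the device and mount
point from earlier in the thread):

# umount /backups
# btrfs check --clear-space-cache=v1 /dev/mapper/Cached-Backups
# mount -o space_cache=v2 /backups

After that one rw mount, the free space tree exists on disk and is used
on all subsequent mounts.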

But anyway, to me that seems premature because we don't even know
what's causing the problem.

a. Freezing means there's a kernel bug. Hands down.
b. Is it freezing on the rebuild? Or something else?
c. I think the devs would like to see the output from btrfs-progs
v4.17.1, 'btrfs check --mode=lowmem' and see if it finds anything, in
particular something not related to free space cache.
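
That would be something like this, unmounted (check runs read-only by
default, so it's safe):

# umount /backups
# btrfs check --mode=lowmem /dev/mapper/Cached-Backups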

Rebuilding either version of space cache requires successfully reading
(and parsing) the extent tree.


-- 
Chris Murphy


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey

On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:

On 10/10/18 17:44, Larkin Lowrey wrote:
(..)

About once a week, or so, I'm running into the above situation where
FS seems to deadlock. All IO to the FS blocks, there is no IO
activity at all. I have to hard reboot the system to recover. There
are no error indications except for the following which occurs well
before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong 
amount of free space
BTRFS warning (device dm-3): failed to load free space cache for 
block group 78691883286528, rebuilding it now


Do I have any options other than nuking the FS and starting over?


Unmount cleanly & mount again with -o space_cache=v2.


It froze while unmounting. The attached zip is a stack dump captured via 
'echo t > /proc/sysrq-trigger'. A second attempt after a hard reboot worked.
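
For reference, the capture was roughly:

# echo 1 > /proc/sys/kernel/sysrq     (enable sysrq if needed)
# echo t > /proc/sysrq-trigger        (dump all task states to the kernel log)
# dmesg > task-dump.txt               (output file name illustrative)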


--Larkin


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Holger Hoffstätte

On 10/10/18 17:44, Larkin Lowrey wrote:
(..)

About once a week, or so, I'm running into the above situation where
FS seems to deadlock. All IO to the FS blocks, there is no IO
activity at all. I have to hard reboot the system to recover. There
are no error indications except for the following which occurs well
before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of 
free space
BTRFS warning (device dm-3): failed to load free space cache for block group 
78691883286528, rebuilding it now

Do I have any options other than nuking the FS and starting over?


Unmount cleanly & mount again with -o space_cache=v2.

-h


Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey

On 9/11/2018 11:23 AM, Larkin Lowrey wrote:

On 8/29/2018 1:32 AM, Qu Wenruo wrote:


On 2018/8/28 9:56 PM, Chris Murphy wrote:
On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo  wrote:


On 2018/8/28 9:29 PM, Larkin Lowrey wrote:

On 8/27/2018 10:12 PM, Larkin Lowrey wrote:

On 8/27/2018 12:46 AM, Qu Wenruo wrote:
The system uses ECC memory and edac-util has not reported any errors.
However, I will run a memtest anyway.

So it should not be the memory problem.

BTW, what's the current generation of the fs?

# btrfs inspect dump-super  | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the
corruption happened.

generation  358392
chunk_root_generation   357256
cache_generation    358392
uuid_tree_generation    358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been
more than a year.

I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
everything up.

No such luck:

Creating a new CRC tree
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
csum result is 0 for block 2412149436416
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28

It's ENOSPC, meaning btrfs can't find enough space for the new csum tree
blocks.

Seems bogus, there's >4TiB unallocated.

What a shame.
Btrfs won't try to allocate a new chunk if we're allocating new tree
blocks for metadata trees (extent, csum, etc).

One quick (and dirty) way to avoid such a limitation is to use the
following patch





No luck.

# ./btrfs check --init-csum-tree /dev/Cached/Backups
Creating a new CRC tree
Opening filesystem to check...
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
Segmentation fault (core dumped)

 btrfs[16575]: segfault at 7ffc4f74ef60 ip 0040d4c3 sp 
7ffc4f74ef50 error 6 in btrfs[40+bf000]


# ./btrfs --version
btrfs-progs v4.17.1

I cloned  btrfs-progs from git and applied your patch.

BTW, I've been having tons of trouble with two hosts after updating 
from kernel 4.17.12 to 4.17.14 and beyond. The fs will become 
unresponsive and all processes will end up stuck waiting on IO. The
system will end up totally idle but unable to perform any IO on the
filesystem. So far things have been stable after reverting back to 
4.17.12. It looks like there was a btrfs change in 4.17.13. Could that 
be related to this csum tree corruption?


About once a week, or so, I'm running into the above situation where FS 
seems to deadlock. All IO to the FS blocks, there is no IO activity at 
all. I have to hard reboot the system to recover. There are no error 
indications except for the following which occurs well before the FS 
freezes up:


BTRFS warning (device dm-3): block group 78691883286528 has wrong amount 
of free space
BTRFS warning (device dm-3): failed to load free space cache for block 
group 78691883286528, rebuilding it now


Do I have any options other than nuking the FS and starting over?

--Larkin


Re: Scrub aborts due to corrupt leaf

2018-09-11 Thread Larkin Lowrey

On 8/29/2018 1:32 AM, Qu Wenruo wrote:


On 2018/8/28 9:56 PM, Chris Murphy wrote:

On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo  wrote:


On 2018/8/28 9:29 PM, Larkin Lowrey wrote:

On 8/27/2018 10:12 PM, Larkin Lowrey wrote:

On 8/27/2018 12:46 AM, Qu Wenruo wrote:

The system uses ECC memory and edac-util has not reported any errors.
However, I will run a memtest anyway.

So it should not be the memory problem.

BTW, what's the current generation of the fs?

# btrfs inspect dump-super  | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the
corruption happened.

generation  358392
chunk_root_generation   357256
cache_generation358392
uuid_tree_generation358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been
more than a year.

I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
everything up.

No such luck:

Creating a new CRC tree
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
csum result is 0 for block 2412149436416
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28

It's ENOSPC, meaning btrfs can't find enough space for the new csum tree
blocks.

Seems bogus, there's >4TiB unallocated.

What a shame.
Btrfs won't try to allocate a new chunk if we're allocating new tree
blocks for metadata trees (extent, csum, etc).

One quick (and dirty) way to avoid such a limitation is to use the
following patch





No luck.

# ./btrfs check --init-csum-tree /dev/Cached/Backups
Creating a new CRC tree
Opening filesystem to check...
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
Segmentation fault (core dumped)

 btrfs[16575]: segfault at 7ffc4f74ef60 ip 0040d4c3 sp 
7ffc4f74ef50 error 6 in btrfs[40+bf000]


# ./btrfs --version
btrfs-progs v4.17.1

I cloned  btrfs-progs from git and applied your patch.

BTW, I've been having tons of trouble with two hosts after updating from 
kernel 4.17.12 to 4.17.14 and beyond. The fs will become unresponsive 
and all processes will end up stuck waiting on IO. The system will end
up totally idle but unable to perform any IO on the filesystem. So far
things have been stable after reverting back to 4.17.12. It looks like 
there was a btrfs change in 4.17.13. Could that be related to this csum 
tree corruption?


--Larkin



Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Qu Wenruo


On 2018/8/28 9:56 PM, Chris Murphy wrote:
> On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo  wrote:
>>
>>
>> On 2018/8/28 9:29 PM, Larkin Lowrey wrote:
>>> On 8/27/2018 10:12 PM, Larkin Lowrey wrote:
 On 8/27/2018 12:46 AM, Qu Wenruo wrote:
>
>> The system uses ECC memory and edac-util has not reported any errors.
>> However, I will run a memtest anyway.
> So it should not be the memory problem.
>
> BTW, what's the current generation of the fs?
>
> # btrfs inspect dump-super  | grep generation
>
> The corrupted leaf has generation 2862, I'm not sure how recently the
> corruption happened.

 generation  358392
 chunk_root_generation   357256
 cache_generation358392
 uuid_tree_generation358392
 dev_item.generation 0

 I don't recall the last time I ran a scrub but I doubt it has been
 more than a year.

 I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
 everything up.
>>>
>>> No such luck:
>>>
>>> Creating a new CRC tree
>>> Checking filesystem on /dev/Cached/Backups
>>> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
>>> Reinitialize checksum tree
>>> csum result is 0 for block 2412149436416
>>> extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28
>>
>> It's ENOSPC, meaning btrfs can't find enough space for the new csum tree
>> blocks.
> 
> Seems bogus, there's >4TiB unallocated.

What a shame.
Btrfs won't try to allocate a new chunk if we're allocating new tree
blocks for metadata trees (extent, csum, etc).

One quick (and dirty) way to avoid such a limitation is to use the
following patch
--
diff --git a/extent-tree.c b/extent-tree.c
index 5d49af5a901e..0a1d21a8d148 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -2652,17 +2652,15 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 		profile = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
 	}
 
-	if (root->ref_cows) {
-		if (!(profile & BTRFS_BLOCK_GROUP_METADATA)) {
-			ret = do_chunk_alloc(trans, info,
-					     num_bytes,
-					     BTRFS_BLOCK_GROUP_METADATA);
-			BUG_ON(ret);
-		}
+	if (!(profile & BTRFS_BLOCK_GROUP_METADATA)) {
 		ret = do_chunk_alloc(trans, info,
-				     num_bytes + SZ_2M, profile);
+				     num_bytes,
+				     BTRFS_BLOCK_GROUP_METADATA);
 		BUG_ON(ret);
 	}
+	ret = do_chunk_alloc(trans, info,
+			     num_bytes + SZ_2M, profile);
+	BUG_ON(ret);
 
 	WARN_ON(num_bytes < info->sectorsize);
 	ret = find_free_extent(trans, root, num_bytes, empty_size,
--
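
To test it, roughly (repo URL and patch file name are illustrative;
this is the usual btrfs-progs build sequence):

$ git clone https://github.com/kdave/btrfs-progs.git
$ cd btrfs-progs
$ git apply ../chunk-alloc.patch
$ ./autogen.sh && ./configure && make
# ./btrfs check --init-csum-tree /dev/Cached/Backups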

Thanks,
Qu

> 
>> Label: none  uuid: acff5096-1128-4b24-a15e-4ba04261edc3
>>Total devices 1 FS bytes used 66.61TiB
>>devid1 size 72.77TiB used 68.03TiB path /dev/mapper/Cached-Backups
>>
>> Data, single: total=67.80TiB, used=66.52TiB
>> System, DUP: total=40.00MiB, used=7.41MiB
>> Metadata, DUP: total=98.50GiB, used=95.21GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Even if all metadata is only csum tree, and ~200GiB needs to be
> written, there's plenty of free space for it.
> 
> 
> 





Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Qu Wenruo


On 2018/8/28 9:56 PM, Chris Murphy wrote:
> On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo  wrote:
>>
>>
>> On 2018/8/28 9:29 PM, Larkin Lowrey wrote:
>>> On 8/27/2018 10:12 PM, Larkin Lowrey wrote:
 On 8/27/2018 12:46 AM, Qu Wenruo wrote:
>
>> The system uses ECC memory and edac-util has not reported any errors.
>> However, I will run a memtest anyway.
> So it should not be the memory problem.
>
> BTW, what's the current generation of the fs?
>
> # btrfs inspect dump-super  | grep generation
>
> The corrupted leaf has generation 2862, I'm not sure how recently the
> corruption happened.

 generation  358392
 chunk_root_generation   357256
 cache_generation358392
 uuid_tree_generation358392
 dev_item.generation 0

 I don't recall the last time I ran a scrub but I doubt it has been
 more than a year.

 I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
 everything up.
>>>
>>> No such luck:
>>>
>>> Creating a new CRC tree
>>> Checking filesystem on /dev/Cached/Backups
>>> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
>>> Reinitialize checksum tree
>>> csum result is 0 for block 2412149436416
>>> extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28
>>
>> It's ENOSPC, meaning btrfs can't find enough space for the new csum tree
>> blocks.
> 
> Seems bogus, there's >4TiB unallocated.

Pretty strange.

This either means the chunk allocator doesn't work or we have something
else wrong.

I'll take a look into this problem.

Thanks,
Qu

> 
>> Label: none  uuid: acff5096-1128-4b24-a15e-4ba04261edc3
>>Total devices 1 FS bytes used 66.61TiB
>>devid1 size 72.77TiB used 68.03TiB path /dev/mapper/Cached-Backups
>>
>> Data, single: total=67.80TiB, used=66.52TiB
>> System, DUP: total=40.00MiB, used=7.41MiB
>> Metadata, DUP: total=98.50GiB, used=95.21GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Even if all metadata is only csum tree, and ~200GiB needs to be
> written, there's plenty of free space for it.
> 
> 
> 





Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Chris Murphy
On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo  wrote:
>
>
> On 2018/8/28 9:29 PM, Larkin Lowrey wrote:
>> On 8/27/2018 10:12 PM, Larkin Lowrey wrote:
>>> On 8/27/2018 12:46 AM, Qu Wenruo wrote:

> The system uses ECC memory and edac-util has not reported any errors.
> However, I will run a memtest anyway.
 So it should not be the memory problem.

 BTW, what's the current generation of the fs?

 # btrfs inspect dump-super  | grep generation

 The corrupted leaf has generation 2862, I'm not sure how recently the
 corruption happened.
>>>
>>> generation  358392
>>> chunk_root_generation   357256
>>> cache_generation358392
>>> uuid_tree_generation358392
>>> dev_item.generation 0
>>>
>>> I don't recall the last time I ran a scrub but I doubt it has been
>>> more than a year.
>>>
>>> I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
>>> everything up.
>>
>> No such luck:
>>
>> Creating a new CRC tree
>> Checking filesystem on /dev/Cached/Backups
>> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
>> Reinitialize checksum tree
>> csum result is 0 for block 2412149436416
>> extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28
>
> It's ENOSPC, meaning btrfs can't find enough space for the new csum tree
> blocks.

Seems bogus, there's >4TiB unallocated.

>Label: none  uuid: acff5096-1128-4b24-a15e-4ba04261edc3
>Total devices 1 FS bytes used 66.61TiB
>devid1 size 72.77TiB used 68.03TiB path /dev/mapper/Cached-Backups
>
>Data, single: total=67.80TiB, used=66.52TiB
>System, DUP: total=40.00MiB, used=7.41MiB
>Metadata, DUP: total=98.50GiB, used=95.21GiB
>GlobalReserve, single: total=512.00MiB, used=0.00B

Even if all metadata is only csum tree, and ~200GiB needs to be
written, there's plenty of free space for it.
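
(Rough math: the device is 72.77TiB with 68.03TiB allocated, so ~4.7TiB
is unallocated; metadata is DUP, so even rewriting all 98.5GiB of
metadata costs about 197GiB of raw space, which fits easily.)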



-- 
Chris Murphy


Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Qu Wenruo


On 2018/8/28 9:29 PM, Larkin Lowrey wrote:
> On 8/27/2018 10:12 PM, Larkin Lowrey wrote:
>> On 8/27/2018 12:46 AM, Qu Wenruo wrote:
>>>
 The system uses ECC memory and edac-util has not reported any errors.
 However, I will run a memtest anyway.
>>> So it should not be the memory problem.
>>>
>>> BTW, what's the current generation of the fs?
>>>
>>> # btrfs inspect dump-super  | grep generation
>>>
>>> The corrupted leaf has generation 2862, I'm not sure how recently the
>>> corruption happened.
>>
>> generation  358392
>> chunk_root_generation   357256
>> cache_generation    358392
>> uuid_tree_generation    358392
>> dev_item.generation 0
>>
>> I don't recall the last time I ran a scrub but I doubt it has been
>> more than a year.
>>
>> I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
>> everything up.
> 
> No such luck:
> 
> Creating a new CRC tree
> Checking filesystem on /dev/Cached/Backups
> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
> Reinitialize checksum tree
> csum result is 0 for block 2412149436416
> extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28

It's ENOSPC, meaning btrfs can't find enough space for the new csum tree
blocks.

I could try to enhance the behavior, from the current one to deleting
tree blocks first and then refilling.

But this needs some extra time to implement.

BTW, judging from the line number, it's not the latest btrfs-progs.

Thanks,
Qu

> btrfs(+0x1da16)[0x55cc43796a16]
> btrfs(btrfs_alloc_free_block+0x207)[0x55cc4379c177]
> btrfs(+0x1602f)[0x55cc4378f02f]
> btrfs(btrfs_search_slot+0xed2)[0x55cc43790be2]
> btrfs(btrfs_csum_file_block+0x48f)[0x55cc437a213f]
> btrfs(+0x55cef)[0x55cc437cecef]
> btrfs(cmd_check+0xd49)[0x55cc437ddbc9]
> btrfs(main+0x81)[0x55cc4378b4d1]
> /lib64/libc.so.6(__libc_start_main+0xeb)[0x7f4717e6324b]
> btrfs(_start+0x2a)[0x55cc4378b5ea]
> Aborted (core dumped)
> 
> --Larkin





Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Larkin Lowrey

On 8/27/2018 10:12 PM, Larkin Lowrey wrote:

On 8/27/2018 12:46 AM, Qu Wenruo wrote:



The system uses ECC memory and edac-util has not reported any errors.
However, I will run a memtest anyway.

So it should not be the memory problem.

BTW, what's the current generation of the fs?

# btrfs inspect dump-super  | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the
corruption happened.


generation  358392
chunk_root_generation   357256
cache_generation    358392
uuid_tree_generation    358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been 
more than a year.


I am running 'btrfs check --init-csum-tree' now. Hopefully that clears 
everything up.


No such luck:

Creating a new CRC tree
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
csum result is 0 for block 2412149436416
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28
btrfs(+0x1da16)[0x55cc43796a16]
btrfs(btrfs_alloc_free_block+0x207)[0x55cc4379c177]
btrfs(+0x1602f)[0x55cc4378f02f]
btrfs(btrfs_search_slot+0xed2)[0x55cc43790be2]
btrfs(btrfs_csum_file_block+0x48f)[0x55cc437a213f]
btrfs(+0x55cef)[0x55cc437cecef]
btrfs(cmd_check+0xd49)[0x55cc437ddbc9]
btrfs(main+0x81)[0x55cc4378b4d1]
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7f4717e6324b]
btrfs(_start+0x2a)[0x55cc4378b5ea]
Aborted (core dumped)

--Larkin


Re: Scrub aborts due to corrupt leaf

2018-08-27 Thread Chris Murphy
On Mon, Aug 27, 2018 at 8:12 PM, Larkin Lowrey
 wrote:
> On 8/27/2018 12:46 AM, Qu Wenruo wrote:
>>
>>
>>> The system uses ECC memory and edac-util has not reported any errors.
>>> However, I will run a memtest anyway.
>>
>> So it should not be the memory problem.
>>
>> BTW, what's the current generation of the fs?
>>
>> # btrfs inspect dump-super  | grep generation
>>
>> The corrupted leaf has generation 2862, I'm not sure how recently the
>> corruption happened.
>
>
> generation  358392
> chunk_root_generation   357256
> cache_generation358392
> uuid_tree_generation358392
> dev_item.generation 0
>
> I don't recall the last time I ran a scrub but I doubt it has been more than
> a year.
>
> I am running 'btrfs check --init-csum-tree' now. Hopefully that clears
> everything up.


I'd expect --init-csum-tree only recreates the data csum tree, and will
not assume a metadata leaf is correct and just recompute a csum for it.


-- 
Chris Murphy


Re: Scrub aborts due to corrupt leaf

2018-08-27 Thread Larkin Lowrey

On 8/27/2018 12:46 AM, Qu Wenruo wrote:



The system uses ECC memory and edac-util has not reported any errors.
However, I will run a memtest anyway.

So it should not be the memory problem.

BTW, what's the current generation of the fs?

# btrfs inspect dump-super  | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the
corruption happened.


generation  358392
chunk_root_generation   357256
cache_generation    358392
uuid_tree_generation    358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been more 
than a year.


I am running 'btrfs check --init-csum-tree' now. Hopefully that clears 
everything up.


Thank you for your help and advice,

--Larkin


Re: Scrub aborts due to corrupt leaf

2018-08-26 Thread Qu Wenruo


On 2018/8/27 10:32 AM, Larkin Lowrey wrote:
> On 8/26/2018 8:16 PM, Qu Wenruo wrote:
>> Corrupted tree block bytenr matches the number reported by the kernel.
>> You could provide the tree block dump for bytenr 7687860535296, and
>> maybe we could find out what's going wrong and fix it manually.
>>
>> # btrfs ins dump-tree -b 7687860535296 
> 
> Thank you for your reply.
> 
> # btrfs ins dump-tree -b 7687860535296 /dev/Cached/Backups
> btrfs-progs v4.15.1
> leaf free space ret -2002721201, leaf data size 16283, used 2002737484
> nritems 319
> leaf 7687860535296 items 319 free space -2002721201 generation 2862 owner 7
> leaf 7687860535296 flags 0x1(WRITTEN) backref revision 1
> fs uuid acff5096-1128-4b24-a15e-4ba04261edc3
> chunk uuid 0d2fdb5d-00c0-41b3-b2ed-39a5e3bf98aa
>     item 0 key (18446744073650847734 EXTENT_CSUM 8487178285056)
> itemoff 13211 itemsize 3072
>     range start 8487178285056 end 8487181430784 length 3145728
>     item 1 key (18446744073650880502 EXTENT_CSUM 8487174090752)
> itemoff 10139 itemsize 3072
>     range start 8487174090752 end 8487177236480 length 3145728
>     item 2 key (18446744073650913270 EXTENT_CSUM 8487167782912)
> itemoff 3251 itemsize 6888
>     range start 8487167782912 end 8487174836224 length 7053312
>     item 3 key (18446744073651011574 EXTENT_CSUM 8487166103552)
> itemoff 187 itemsize 3064
>     range start 8487166103552 end 8487169241088 length 3137536
>     item 4 key (58523648 UNKNOWN.0 4115587072) itemoff 0 itemsize 0

Starting from this item, the leaf is definitely corrupted.

>     item 5 key (58523648 UNKNOWN.0 4115058688) itemoff 0 itemsize 0
>     item 6 key (58392576 UNKNOWN.0 4115050496) itemoff 0 itemsize 0
>     item 7 key (58392576 UNKNOWN.0 9160800976331685888) itemoff
> 1325803612 itemsize 1549669347
[snip]
> Segmentation fault (core dumped)
> 
> Can I simply rebuild the csum tree (btrfs check --init-csum-tree)? The
> entire contents of the fs are back-up files that are hashed so I can
> verify that the files are correct.

Yes, I just forgot we have the --init-csum-tree option.

You could try that way; at least from the previous check run, there is
no other serious corruption.
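
A sketch of the run (the fs must be unmounted first):

# umount /backups
# btrfs check --init-csum-tree /dev/Cached/Backups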

> 
>> Please note that this corruption could be caused by bad RAM or some old
>> kernel bug.
>> It's recommended to run a memtest if possible.
> 
> The system uses ECC memory and edac-util has not reported any errors.
> However, I will run a memtest anyway.

So it should not be the memory problem.

BTW, what's the current generation of the fs?

# btrfs inspect dump-super  | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the
corruption happened.

Thanks,
Qu

> 
> Thank you,
> 
> --Larkin





Re: Scrub aborts due to corrupt leaf

2018-08-26 Thread Larkin Lowrey

On 8/26/2018 8:16 PM, Qu Wenruo wrote:

Corrupted tree block bytenr matches with the number reported by kernel.
You could provide the tree block dump for bytenr 7687860535296, and
maybe we could find out what's going wrong and fix it manually.

# btrfs ins dump-tree -b 7687860535296 


Thank you for your reply.

# btrfs ins dump-tree -b 7687860535296 /dev/Cached/Backups
btrfs-progs v4.15.1
leaf free space ret -2002721201, leaf data size 16283, used 2002737484 
nritems 319

leaf 7687860535296 items 319 free space -2002721201 generation 2862 owner 7
leaf 7687860535296 flags 0x1(WRITTEN) backref revision 1
fs uuid acff5096-1128-4b24-a15e-4ba04261edc3
chunk uuid 0d2fdb5d-00c0-41b3-b2ed-39a5e3bf98aa
    item 0 key (18446744073650847734 EXTENT_CSUM 8487178285056) 
itemoff 13211 itemsize 3072

    range start 8487178285056 end 8487181430784 length 3145728
    item 1 key (18446744073650880502 EXTENT_CSUM 8487174090752) 
itemoff 10139 itemsize 3072

    range start 8487174090752 end 8487177236480 length 3145728
    item 2 key (18446744073650913270 EXTENT_CSUM 8487167782912) 
itemoff 3251 itemsize 6888

    range start 8487167782912 end 8487174836224 length 7053312
    item 3 key (18446744073651011574 EXTENT_CSUM 8487166103552) 
itemoff 187 itemsize 3064

    range start 8487166103552 end 8487169241088 length 3137536
    item 4 key (58523648 UNKNOWN.0 4115587072) itemoff 0 itemsize 0
    item 5 key (58523648 UNKNOWN.0 4115058688) itemoff 0 itemsize 0
    item 6 key (58392576 UNKNOWN.0 4115050496) itemoff 0 itemsize 0
    item 7 key (58392576 UNKNOWN.0 9160800976331685888) itemoff 
1325803612 itemsize 1549669347
    item 8 key (15706350841398176100 UNKNOWN.160 
9836230374950416562) itemoff -507102832 itemsize -1565142843
    item 9 key (16420776794030147775 UNKNOWN.139 
1413404178631177347) itemoff 319666572 itemsize -2033238481
    item 10 key (12490357187492557094 UNKNOWN.100 
8703020161114007581) itemoff 1698374107 itemsize 427239449
    item 11 key (10238910558655956878 UNKNOWN.145 
13172984620675614213) itemoff -1386707845 itemsize -2094889124
    item 12 key (14429452134272870167 UNKNOWN.47 
5095274587264087555) itemoff -385621303 itemsize -1014793681
    item 13 key (12392706351935785292 TREE_BLOCK_REF 
17075682359779944300) itemoff 467435242 itemsize -1974352848

    tree block backref
    item 14 key (9030638330689148475 UNKNOWN.146 
16510052416438219760) itemoff -1329727247 itemsize -989772882
    item 15 key (2557232588403612193 UNKNOWN.89 
11359249297629415033) itemoff -1393664382 itemsize -222178533
    item 16 key (16832668804185527807 UNKNOWN.190 
12813564574805698827) itemoff -824350641 itemsize 113587270
    item 17 key (17721977661761488041 UNKNOWN.133 
65181195353232031) itemoff 1165455420 itemsize -11248999
    item 18 key (17041494636387836535 UNKNOWN.146 
659630272632027956) itemoff 1646352770 itemsize 188954807
    item 19 key (4813797791329885851 UNKNOWN.147 
2988230942665281926) itemoff 2034137186 itemsize 429359084
    item 20 key (11925872190557602809 UNKNOWN.28 
10017979389672184473) itemoff 198274722 itemsize 1654501802
    item 21 key (18089916911465221293 UNKNOWN.215 
130744227189807288) itemoff -938569572 itemsize -322594079
    item 22 key (17582525817082834821 UNKNOWN.133 
14298100207216235213) itemoff 997305640 itemsize 380205383
    item 23 key (2509730330338250179 ORPHAN_ITEM 
8415032273173690331) itemoff 1213495256 itemsize -1813460706

    orphan item
    item 24 key (17657358590741059587 UNKNOWN.5 
4198714773705203243) itemoff -690501330 itemsize -237182892
    item 25 key (14784171376049469241 UNKNOWN.139 
15453005915765327150) itemoff 1543890422 itemsize 2093403168
    item 26 key (8296048569161577100 UNKNOWN.58 
12559616442258240580) itemoff 927535366 itemsize -620630864
    item 27 key (14738413134752477244 SHARED_BLOCK_REF 
90867799437527556) itemoff -629160915 itemsize 1418942359

    shared block backref
    item 28 key (17386064595326971933 SHARED_BLOCK_REF 
1813311842215708701) itemoff 1401681450 itemsize -2016124808

    shared block backref
    item 29 key (12068018374989506977 UNKNOWN.160 
1560146733122974605) itemoff -1145774613 itemsize -490403576
    item 30 key (5611751644962296316 QGROUP_LIMIT 
19245/207762978715732) itemoff -433607332 itemsize -854595036

Segmentation fault (core dumped)

Can I simply rebuild the csum tree (btrfs check --init-csum-tree)? The 
entire contents of the fs are back-up files that are hashed so I can 
verify that the files are correct.



Please note that this corruption could be caused by bad RAM or some old
kernel bug.
It's recommended to run a memtest if possible.


The system uses ECC memory and edac-util has not reported any errors. 
However, I will run a memtest anyway.


Thank you,

--Larkin


Re: Scrub aborts due to corrupt leaf

2018-08-26 Thread Qu Wenruo


On 2018/8/27 4:45 AM, Larkin Lowrey wrote:
> When I do a scrub it aborts about 10% of the way in due to:
> 
> corrupt leaf: root=7 block=7687860535296 slot=0, invalid key objectid
> for csum item, have 18446744073650847734 expect 18446744073709551606

This error message explains itself.

Key objectid is not valid.

> 
> The filesystem in question stores my backups and I have verified all of
> the backups so I know all files that are supposed to be there are there
> and their hashes match. Backups run normally and everything seems to
> work fine, it's just the scrub that doesn't.

No, scrub works as expected: during its csum fetching, it detects a bad
csum tree block.
This means your csum tree is corrupted.

> 
> I tried:
> 
> # btrfs check --repair /dev/Cached/Backups
> enabling repair mode
> Checking filesystem on /dev/Cached/Backups
> UUID: acff5096-1128-4b24-a15e-4ba04261edc3
> Fixed 0 roots.
> checking extents
> leaf free space ret -2002721201, leaf data size 16283, used 2002737484
> nritems 319
> leaf free space ret -2002721201, leaf data size 16283, used 2002737484
> nritems 319

--repair doesn't support repairing such corruption yet.

> leaf free space incorrect 7687860535296 -2002721201
> bad block 7687860535296

Corrupted tree block bytenr matches the number reported by the kernel.

You could provide the tree block dump for bytenr 7687860535296, and
maybe we could find out what's going wrong and fix it manually.

# btrfs ins dump-tree -b 7687860535296 

Please note that this corruption could be caused by bad RAM or some old
kernel bug.
It's recommended to run a memtest if possible.

> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache
> block group 34028518375424 has wrong amount of free space
> failed to load free space cache for block group 34028518375424
> checking fs roots
> root 5 inode 6784890 errors 1000, some csum missing
> checking csums
> there are no extents for csum range 6447630387207159216-6447630390115868080
> csum exists for 6447630387207159216-6447630390115868080 but there is no
> extent record
> there are no extents for csum range 763548178418734000-763548181428650928
> csum exists for 763548178418734000-763548181428650928 but there is no
> extent record
> there are no extents for csum range
> 10574442573086800664-10574442573732416280
> csum exists for 10574442573086800664-10574442573732416280 but there is
> no extent record
> ERROR: errors found in csum tree
> found 73238589853696 bytes used, error(s) found
> total csum bytes: 8117840900
> total tree bytes: 34106834944
> total fs tree bytes: 23289413632
> total extent tree bytes: 1659682816
> btree space waste bytes: 6020692848
> file data blocks allocated: 73136347418624
>  referenced 73135917441024
> 
> Nothing changes because when I run the above command again the output is
> identical.
> 
> I had been using space_cache v2 but reverted to nospace_cache to run the
> above.

The corrupted tree block is in the csum tree, thus space_cache is not related.

> 
> Is there any way to clean this up?

Only manual patching is possible, as the corruption looks very much
like memory corruption.

Thanks,
Qu

> 
> kernel 4.17.14-202.fc28.x86_64
> btrfs-progs v4.15.1
> 
> Label: none  uuid: acff5096-1128-4b24-a15e-4ba04261edc3
>     Total devices 1 FS bytes used 66.61TiB
>     devid    1 size 72.77TiB used 68.03TiB path
> /dev/mapper/Cached-Backups
> 
> Data, single: total=67.80TiB, used=66.52TiB
> System, DUP: total=40.00MiB, used=7.41MiB
> Metadata, DUP: total=98.50GiB, used=95.21GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> BTRFS info (device dm-3): disk space caching is enabled
> BTRFS info (device dm-3): has skinny extents
> BTRFS info (device dm-3): bdev /dev/mapper/Cached-Backups errs: wr 0, rd
> 0, flush 0, corrupt 666, gen 25
> BTRFS info (device dm-3): enabling ssd optimizations
> 
> 
> 





Scrub aborts due to corrupt leaf

2018-08-26 Thread Larkin Lowrey

When I do a scrub it aborts about 10% of the way in due to:

corrupt leaf: root=7 block=7687860535296 slot=0, invalid key objectid 
for csum item, have 18446744073650847734 expect 18446744073709551606


The filesystem in question stores my backups and I have verified all of 
the backups so I know all files that are supposed to be there are there 
and their hashes match. Backups run normally and everything seems to 
work fine, it's just the scrub that doesn't.
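
For reference, the scrub is just a plain run against the mounted fs,
e.g. (flags illustrative):

# btrfs scrub start -B /backups
# btrfs scrub status /backups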


I tried:

# btrfs check --repair /dev/Cached/Backups
enabling repair mode
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Fixed 0 roots.
checking extents
leaf free space ret -2002721201, leaf data size 16283, used 2002737484 
nritems 319
leaf free space ret -2002721201, leaf data size 16283, used 2002737484 
nritems 319

leaf free space incorrect 7687860535296 -2002721201
bad block 7687860535296
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
block group 34028518375424 has wrong amount of free space
failed to load free space cache for block group 34028518375424
checking fs roots
root 5 inode 6784890 errors 1000, some csum missing
checking csums
there are no extents for csum range 6447630387207159216-6447630390115868080
csum exists for 6447630387207159216-6447630390115868080 but there is no 
extent record

there are no extents for csum range 763548178418734000-763548181428650928
csum exists for 763548178418734000-763548181428650928 but there is no 
extent record
there are no extents for csum range 
10574442573086800664-10574442573732416280
csum exists for 10574442573086800664-10574442573732416280 but there is 
no extent record

ERROR: errors found in csum tree
found 73238589853696 bytes used, error(s) found
total csum bytes: 8117840900
total tree bytes: 34106834944
total fs tree bytes: 23289413632
total extent tree bytes: 1659682816
btree space waste bytes: 6020692848
file data blocks allocated: 73136347418624
 referenced 73135917441024

Nothing changes because when I run the above command again the output is 
identical.


I had been using space_cache v2 but reverted to nospace_cache to run the 
above.


Is there any way to clean this up?

kernel 4.17.14-202.fc28.x86_64
btrfs-progs v4.15.1

Label: none  uuid: acff5096-1128-4b24-a15e-4ba04261edc3
    Total devices 1 FS bytes used 66.61TiB
    devid    1 size 72.77TiB used 68.03TiB path 
/dev/mapper/Cached-Backups


Data, single: total=67.80TiB, used=66.52TiB
System, DUP: total=40.00MiB, used=7.41MiB
Metadata, DUP: total=98.50GiB, used=95.21GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

BTRFS info (device dm-3): disk space caching is enabled
BTRFS info (device dm-3): has skinny extents
BTRFS info (device dm-3): bdev /dev/mapper/Cached-Backups errs: wr 0, rd 
0, flush 0, corrupt 666, gen 25

BTRFS info (device dm-3): enabling ssd optimizations