a developer will get around to looking at it.
It is safe to try:
mount -o ro,norecovery,usebackuproot /device/ /mnt/
If that works, I suggest updating your backup while it's still
possible in the meantime.
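A minimal sketch of that sequence, assuming placeholder paths (/dev/sdX, /mnt and /backup are not from this thread) and assuming rsync is the tool used to refresh the backup. By default it only prints the commands; set DRY_RUN=0 to actually run them as root.

```shell
# Sketch only: /dev/sdX, /mnt and /backup are placeholder paths (assumptions).
# DRY_RUN=1 (the default) prints the commands instead of running them.
rescue_backup() {
    dev=$1 mnt=$2 dest=$3
    run() {
        if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
    }
    # Read-only mount, no log replay, fall back to a backup tree root.
    run mount -o ro,norecovery,usebackuproot "$dev" "$mnt" || return 1
    # Copy everything off while the mount still works.
    run rsync -aHAX "$mnt"/ "$dest"/
}

rescue_backup /dev/sdX /mnt /backup
```

Nothing here writes to the damaged file system; the only writes go to the backup destination.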
--
Chris Murphy
sk the drive for SCT ERC status. The simplest way to
know is to run 'smartctl -x' on one drive, assuming they're all the same
basic make/model other than size.
--
Chris Murphy
complete dmesg; but errno=-5 IO failure is
pretty much some kind of hardware problem in my experience. I haven't
seen it be a bug.
--
Chris Murphy
a
developer.
The thing I'd like to see is
# btrfs rescue super -v /anydevice/
# btrfs insp dump-s -f /anydevice/
First command will tell us if all the supers are the same and valid
across all devices. And the second one, hopefully it's pointed to a
device with valid super, will tell us if there's a
RAID all the time. And arguably
it leads to unnecessary data loss in even the single device
desktop/laptop use case as well.
Chris Murphy
ch for each? That's a major
reduction in writes, and suggests it might be possible for further
optimization, to help mitigate the wandering trees impact.
--
Chris Murphy
doesn't really matter, but I'd get whatever
you can off the drive. I expect avoiding a rebuild in some form or
another is very wishful thinking and not very likely.
The more changes are made to the file system, whether by repair attempts
or by otherwise writing to it, the lower the chance of recovery.
--
Chris Murphy
by default runs fstrim.service once a
week (which in turn issues fstrim, I think on all mounted volumes.)
I am a bit more concerned about the read errors you had that were
being corrected automatically? The corruption suggests a firmware bug
related to trim. I'd check the affected SSD firmware revision and
consider updating it (only after a backup, it's plausible the firmware
update is not guaranteed to be data safe). Does the volume use DUP or
raid1 metadata? I'm not sure how it's correcting for these problems
otherwise.
--
Chris Murphy
t in? It matters if this is a
continuous appending format, or if it's writing them out as individual
JPEG files, one per frame, or whatever. What rate, what size, and any
other concurrent operations, etc.
--
Chris Murphy
that expects containers to be transient disposable
objects.
--
Chris Murphy
s
all the missing copies, they're distributed. Which means you've got a
very good chance in a 2 drive failure of losing two copies of either
metadata or data or both. While I'm not certain it's 100% not
survivable, the real gotcha is it's possible maybe even likely that
it'll mount and seem to work fine but as soon as it runs into two
missing bg's, it'll face plant.
--
Chris Murphy
Also, since you don't have any snapshots, you could also find this
conventionally:
# du -sh /*
Chris Murphy
mnt
cd /mnt
btrfs fi du -s *
Maybe that will help reveal where it's hiding. It's possible btrfs fi
du does not cross bind mounts. I know the Total column does include
amounts in nested subvolumes.
--
Chris Murphy
lk errors). Mounting with both ro and
nologreplay will ensure no writes are needed, allowing the mount to
succeed. Of course, any changes that are in the log tree will be
missing, so recent transactions may be unrecoverable, but so far I've
had good luck recovering from broken SD cards this way.
ameters but
> it doesn't change anything nor trying btrf check in single user mode.
>
> Where is my missing 30 GB?
--
Chris Murphy
ely discards very
recently stale Btrfs metadata and can make recovery from crashes
harder).
There is a trim bug that causes FITRIM to be applied only to
unallocated space on older file systems that have been balanced such
that block group logical addresses are outside the physical address
space of the device; the free space inside such block groups gets
passed over for FITRIM. It looks like this will be fixed in
kernel 4.20/5.0.
--
Chris Murphy
dora or Arch live or install media,
mount the Btrfs and try to read the problem files and see if the
problem still happens. I can't even begin to estimate the tens of
thousands of line changes since kernel 4.9.
What profile are you using for this Btrfs? Is this a raid56? What do
you get for 'btrfs fi us ' ?
--
Chris Murphy
his will drop the use of the treelog which is used to improve
performance on operations that use fsync. With this option,
transactions calling fsync() fall back to sync() so it's safer but
slower.
--
Chris Murphy
y copy
files from a directory to /dev/null and then check kernel messages for
any errors. So long as metadata is DUP, there is a good chance a bad
copy of metadata can be automatically fixed up with a good copy. If
there's only a single copy of metadata, or both copies get corrupt, then
it's difficult. Usually recovery of data is possible, but depending on
what's damaged, repair might not be possible.
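A rough sketch of that read test, assuming a POSIX shell. The function name and the example path are made up for illustration. It reads every regular file under a directory, discards the data, and lists any file that could not be read; file names containing newlines are not handled.

```shell
# Read every regular file under a directory and throw the data away.
# Prints the path of each file that failed to read; returns non-zero if any did.
read_all_files() {
    dir=$1
    find "$dir" -type f -print | (
        failed=0
        while IFS= read -r f; do
            if ! cat -- "$f" > /dev/null 2>&1; then
                echo "read error: $f"
                failed=1
            fi
        done
        # The subshell's exit status becomes the function's return status.
        exit "$failed"
    )
}

# Example use (path is a placeholder), followed by a look at kernel messages:
#   read_all_files /mnt/some/directory
#   dmesg | grep -i btrfs
```

A clean run here only says the files were readable; the kernel messages are where csum errors and any automatic fix-ups show up.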
--
Chris Murphy
d from seed to sprout, and that the
sprout can be unmounted.
--
Chris Murphy
" (or more of a limitation)
if the guest is using cache=none on the block device?
Anton what virtual machine tech are you using? qemu/kvm managed with
virt-manager? The configuration affects host behavior, but the
negative effect manifests inside the guest as corruption, if I
remember correctly.
--
Chris Murphy
On Tue, Oct 16, 2018 at 2:13 AM, Anand Jain wrote:
>
>
> On 10/14/2018 06:28 AM, Chris Murphy wrote:
>>
>> Is it practical and desirable to make Btrfs based OS installation
>> images reproducible? Or is Btrfs simply too complex and
>> non-deterministic?
On Mon, Oct 15, 2018 at 3:26 PM, Anton Shepelev wrote:
> Chris Murphy to Anton Shepelev:
>
>> > How can I track down the origin of this mount point:
>> >
>> > /dev/sda2 on /home/hana type btrfs
>> > (rw,relatime,space_cache,subvoli
untu and you're using Timeshift?
Maybe it'll show up in the journal if you add boot parameter
'systemd.log_level=debug' and reboot; then use 'journalctl -b | grep
mount' and it should show all logged instances of mount
events: systemd, udisks2, maybe others?
--
Chris Murphy
On Mon, Oct 15, 2018 at 6:29 AM, Austin S. Hemmelgarn
wrote:
> On 2018-10-13 18:28, Chris Murphy wrote:
>> The end result is creating two Btrfs volumes would yield image files
>> with matching hashes.
>
> So in other words, you care about matching the block layout _exactly_.
seed device 2.0" idea. But Btrfs is so
complicated it's maybe too much work, hence the question.
--
Chris Murphy
make_ext4) to set all timestamps to
this value, and configurable uuid's for everything that uses uuids,
and whatever other constraints are necessary.
--
Chris Murphy
On Sat, Oct 13, 2018 at 4:28 PM, Chris Murphy wrote:
> Is it practical and desirable to make Btrfs based OS installation
> images reproducible? Or is Btrfs simply too complex and
> non-deterministic? [1]
>
> The main three problems with Btrfs right now for reproducibility are:
>
cking.
[1] problems of reproducible system images
https://reproducible-builds.org/docs/system-images/
[2] purpose and motivation for reproducible builds
https://reproducible-builds.org/
[3] who is involved?
https://reproducible-builds.org/who/#Qubes%20OS
--
Chris Murphy
What version of btrfs-progs?
trace 7470f1b607c73b6c ]---
[103780.285841] BTRFS warning (device mmcblk0p3):
cleanup_transaction:1847: Aborting unused transaction(No space left).
[103780.289891] BTRFS info (device mmcblk0p3): delayed_refs has NO entry
--
Chris Murphy
On Wed, Oct 10, 2018 at 9:07 PM, Larkin Lowrey
wrote:
> On 10/10/2018 10:51 PM, Chris Murphy wrote:
>>
>> On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey
>> wrote:
>>>
>>> On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:
>>>>
>>>> On
On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey
wrote:
> On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:
>>
>> On 10/10/2018 07:44 PM, Chris Murphy wrote:
>>>
>>>
>>> I'm pretty sure you have to umount, and then clear the space_cache
>>> wi
e kernel code will not mount a Btrfs if the first super is
not present or valid (checksum match)?
--
Chris Murphy
en but as it
is we have nothing really to go on.
--
Chris Murphy
the rebuild? Or something else?
c. I think the devs would like to see the output from btrfs-progs
v4.17.1, 'btrfs check --mode=lowmem' and see if it finds anything, in
particular something not related to free space cache.
Rebuilding either version of space cache requires successfully reading
(and parsing) the extent tree.
--
Chris Murphy
On Tue, Oct 9, 2018 at 11:25 AM, Andrei Borzenkov wrote:
> 09.10.2018 18:52, Chris Murphy wrote:
>>> In this case is root/big_file and snapshot/big_file still share the same
>>> data?
>>
>> You'll be left with three files. /big_file and root/big_file wil
ed
extents with /big_file - or deduplicate.
--
Chris Murphy
seconds time elapsed
[chris@flap ~]$
Seems like a lot of activity for just a few transactions, but what
really caught my eye here is the qgroup reporting for a file system
that has never had qgroups enabled. Is it expected?
Chris Murphy
out, rather than do a SATA link reset in which case Btrfs can't do
anything about it).
--
Chris Murphy
e how well understood they are. But other people don't have
problems with it.
It's worth looking through the archives about some things. Btrfs
raid56 isn't exactly perfectly COW, there is read-modify-write code
that means there can be overwrites. I vaguely recall that it's COW in
the logical layer, but the physical writes can end up being RMW, or at
least not reliably COW.
--
Chris Murphy
nnot get kdump to work. The crashkernel is loaded and everything is
> setup for it afaict. I asked a question on this over at stackexchange but no
> answer yet.
> https://unix.stackexchange.com/questions/469838/linux-kdump-does-not-boot-second-kernel-when-kernel-is-crashing
>
> So i did a little digging and added some debug printk() statements to see
> whats going on and it seems that panic() is never called. maybe the second
> stack trace is the reason?
> Screenshot is here: https://t-5.eu/owncloud/index.php/s/OegsikXo4VFLTJN
>
> Could someone please tell me where I can report this problem and get some
> help on this topic?
Try kexec mailing list. They handle kdump.
http://lists.infradead.org/mailman/listinfo/kexec
--
Chris Murphy
Adding fsdevel@, linux-ext4, and btrfs@ (which has a separate subject
on this same issue)
On Wed, Sep 19, 2018 at 7:45 PM, Dave Chinner wrote:
>On Wed, Sep 19, 2018 at 10:23:38AM -0600, Chris Murphy wrote:
>> Fedora 29 has a new feature to test if boot+startup fails, so the
>>
On Mon, Sep 17, 2018 at 9:44 PM, Chris Murphy wrote:
> https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F
>
> Does anyone know if this is still a problem on Btrfs if grubenv has
> xattr +C set? In which case it should be possible to overwrite and
> t
le devices? Eek!
>
> Recompute the parity should not be a big deal. Updating all the (b)trees
> would be a too complex goal.
I think it's just asking for trouble. Sometimes the best answer ends
up being no, no and definitely no.
--
Chris Murphy
On Tue, Sep 18, 2018 at 1:01 PM, Andrei Borzenkov wrote:
> 18.09.2018 21:57, Chris Murphy wrote:
>> On Tue, Sep 18, 2018 at 12:16 PM, Andrei Borzenkov
>> wrote:
>>> 18.09.2018 08:37, Chris Murphy wrote:
>>
>>>> The patches aren't upstream yet
those distros that support Secure Boot, in practice you're
stuck with the behavior of their prebuilt GRUB binary that goes on the
ESP.
--
Chris Murphy
On Tue, Sep 18, 2018 at 12:16 PM, Andrei Borzenkov wrote:
> 18.09.2018 08:37, Chris Murphy wrote:
>> The patches aren't upstream yet? Will they be?
>>
>
> I do not know. Personally I think much easier is to make grub location
> independent of /boot, allowing grub
On Tue, Sep 18, 2018 at 11:15 AM, Goffredo Baroncelli
wrote:
> On 18/09/2018 06.21, Chris Murphy wrote:
>> b. The bootloader code, would have to have sophisticated enough Btrfs
>> knowledge to know if the grubenv has been reflinked or snapshot,
>> because even if +C
On Mon, Sep 17, 2018 at 11:24 PM, Andrei Borzenkov wrote:
> 18.09.2018 07:21, Chris Murphy wrote:
>> On Mon, Sep 17, 2018 at 9:44 PM, Chris Murphy
>> wrote:
>>> https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F
>>>
>>> Does anyon
On Mon, Sep 17, 2018 at 9:44 PM, Chris Murphy wrote:
> https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F
>
> Does anyone know if this is still a problem on Btrfs if grubenv has
> xattr +C set? In which case it should be possible to overwrite and
> t
for, effectively out of tree
code, to be making modifications to the file system, outside of the
file system.
--
Chris Murphy
distro support, I don't often see SUSE users on
here.
OpenZFS is a different strategy because they're using out of tree
code. So you can run older kernels, and compile the current openzfs
code base against your older kernel. In effect you're using an older
distro kernel, but with new file system code base supported by that
upstream.
--
Chris Murphy
but I'm not sure
>> what version, maybe by 4.14?
>
> Sounds about right -- my version is 4.7.3.
It's not dangerous to use it (maybe --repair is more dangerous, but
don't use it without advice first, no matter the version). You just don't
get new features and bug fixes. It's also not dangerous to use
something much newer; again, if the user space tools are very new and
the kernel is old, you just don't get certain features.
--
Chris Murphy
ssages, user space messages aren't
enough.
Anyway, good luck with openzfs, cool project.
--
Chris Murphy
eate /bkp/backup-subvol
> cp -prv --reflink=always /bkp/backup/* /bkp/backup-subvol/
Yeah that will take a lot of writes that are not necessary, now that
you see backup is a subvolume already. If you want a copy of it, just
snapshot it.
--
Chris Murphy
all the metadata is fully read,
modified (new inodes) and written out.
But either way it should work.
--
Chris Murphy
e using the subvol= or
subvolid= mount options, you need to specify noatime every time; once
per file system isn't enough.
--
Chris Murphy
(resend to all)
On Thu, Sep 13, 2018 at 9:44 AM, Nikolay Borisov wrote:
>
>
> On 13.09.2018 18:30, Chris Murphy wrote:
>> This is the 2nd or 3rd thread containing hanging btrfs send, with
>> kernel 4.18.x. The subject of one is "btrfs send hung in pipe_wait"
0.rc3.git2.1 - which
translates to git 54eda9df17f3.
Chris Murphy
ing root refs done with fs roots in lowmem mode, skipping
> [7/7] checking quota groups skipped (not enabled on this FS)
> found 708354711552 bytes used, no error found
> total csum bytes: 689206904
> total tree bytes: 2423865344
> total fs tree bytes: 1542914048
> total extent tree bytes: 129843200
> btree space waste bytes: 299191292
> file data blocks allocated: 31709967417344
> referenced 928531877888
OK good to know.
--
Chris Murphy
. It is slow, however.
--
Chris Murphy
t Btrfs volumes.
All I can say is you need to keep changing things up, process of
elimination. Rather tedious. Maybe you could try downloading a Fedora
28 ISO, make a boot stick out of it, and try to reproduce with the
same drives. At least that's an easy way to isolate the OS from the
equation.
--
Chris Murphy
basically becomes unusable.
What kernel? Latest stable is 4.18.6, but I want to make sure that's
what you're using, someone else has reported btrfs send problems in
another thread with 4.18.5 that sound similar.
--
Chris Murphy
ainly in the kernel, where the receive code is
mainly in user space tools, for this testing you don't need to
downgrade user space tools. If there's a bug here, I expect it's
kernel.
--
Chris Murphy
get a discrete error message from the drive for one
of those and Btrfs overwrote that bad sector with a good copy (it's in
a raid1 volume), so working as designed I guess.
Since you didn't get a fix up message from Btrfs, either the whole
thing just got confused with hanging tasks, or it's possible it's a
data block.
--
Chris Murphy
en I directly connect USB
3.0 drives to my Intel NUC, but I don't ever get them when plugging
the drive into a dyconn hub. So if you don't already have a hub in
between the drive and the computer, it might be worth considering.
Basically the hub is going to read and completely rewrite the whole
str
Those are all read only commands, nothing is written or changed.
--
Chris Murphy
That does appear to affect
performance for some things, including send/receive.
--
Chris Murphy
a
1KiB sector size, but Btrfs (on x86_64) still uses 4096 byte "sector"
and it all seems to work fine despite that.
Anyway, maybe it's useful for some fstests instead of file backed
losetup devices?
--
Chris Murphy
subvolume to be
treated as if it were read only even if the volume is mounted read
only. And it takes a read only subvolume for send to work.
--
Chris Murphy
racle
> (and most of java, correct?).
Not really. The ZFS we care about now is OpenZFS, forked from Oracle's
ZFS. And a bunch of people not related to Oracle do that work. And
Btrfs has a wide assortment of developers: Facebook, SUSE, Fujitsu,
Oracle, and more.
--
Chris Murphy
On Mon, Sep 3, 2018 at 4:23 AM, Adam Borowski wrote:
> On Sun, Sep 02, 2018 at 09:15:25PM -0600, Chris Murphy wrote:
>> For > 10 years drive firmware handles bad sector remapping internally.
>> It remaps the sector logical address to a reserve physical sector.
>>
>>
t, and the actual mystery is if that double message is for both
drives even though only sda2 is named both times (the first two lines
of your dmesg). There are some kinds of memory related corruption that
newer versions of btrfs-progs can fix. I'm not sure if 4.4 is new
enough, or if the particular corruption you're seeing is something
btrfs check can fix, but I still wouldn't use --repair until Qu or
another dev says to give it a shot.
--
Chris Murphy
On Sat, Sep 1, 2018 at 1:03 AM, Pierre Couderc wrote:
>
>
> On 08/31/2018 08:52 PM, Chris Murphy wrote:
>>
>>
>> Bad sector which is failing write. This is fatal, there isn't anything
>> the block layer or Btrfs (or ext4 or XFS) can do about it. Well,
>> e
firmware would know that. It's not likely
a cable problem or something like that. And that the write error is
reported at all means it's persistent, not transient.
Chris Murphy
> Aug 31 17:36:38 server kernel: sd 0:0:0:0: [sda] CDB: Write(10) 2a 00 00 61
> 9c 00 00 0a 00 00
> Aug 31 17:36:38 server kernel: blk_update_request: I/O error, dev sda,
> sector 6396928
Bad sector which is failing write. This is fatal, there isn't anything
the block layer or Btrfs (or ext4 or XFS) can do about it. Well,
ext234 do have an option to scan for bad sectors and create a bad
sector map which then can be used at mkfs time, and ext234 will avoid
using those sectors. And also the md driver has a bad sector option
for the same, and does remapping. But XFS and Btrfs don't do that.
If the drive is under warranty, get it swapped out, this is definitely
a warranty covered problem.
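To tie the two quoted kernel lines together, here is a small sketch that decodes the CDB from the log. The function name is made up for illustration; the field layout assumed is the standard WRITE(10) one (opcode 0x2a, bytes 2-5 the big-endian LBA, bytes 7-8 the transfer length in blocks). The LBA field 00 61 9c 00 decodes to 6396928, which is the sector the block layer reported.

```shell
# Decode a SCSI WRITE(10) CDB given as ten space-separated hex bytes.
# Byte 0: opcode (0x2a). Bytes 2-5: LBA, big-endian. Bytes 7-8: transfer length in blocks.
decode_write10() {
    set -- $1    # split the string so each byte becomes a positional parameter
    [ "$#" -eq 10 ] || { echo "expected 10 bytes" >&2; return 1; }
    [ "$1" = "2a" ] || { echo "not a WRITE(10) CDB" >&2; return 1; }
    lba=$(( 0x$3$4$5$6 ))
    blocks=$(( 0x$8$9 ))
    echo "lba=$lba blocks=$blocks"
}

decode_write10 "2a 00 00 61 9c 00 00 0a 00 00"
# prints: lba=6396928 blocks=2560
```

So the failing command was a 2560 block write starting at exactly the sector named in the I/O error.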
--
Chris Murphy
Additionally, the description of -d and -R doesn't help me distinguish
between the two. -R says "instead of a summary", which suggests -d
will summarize, but that isn't explicitly stated.
--
Chris Murphy
of /dev/loop0p4: No such file or directory
[chris@f28h ~]$
I guess that's a good sign in this case?
Chris Murphy
ss actually pointing to the address
where the EBR is. And that EBR's first record contains the actual real
extended partition information.
So this represents two bugs in the installer:
1. If there's only one partition on a drive, it should be primary by
default, not extended.
2. But if extended, it must point to an EBR, and the EBR must be
created at that location. Obviously since there is no /dev/sdb2, this
EBR is not present.
--
Chris Murphy
merely a delayed discovery of one of
the devices. Once a Btrfs volume is degraded, it does not
automatically resume normal operation just because the formerly
missing device becomes available.
So... this is flat out not suitable for use cases where you need
unattended raid1 degraded boot.
--
Chris Murphy
do-release-upgrade are
> managed for auto cleanup
Ha! I should have read all the emails.
Anyway, good sleuthing. I think it's a good idea to file a bug report
on it, so at the least other people can fix it manually.
--
Chris Murphy
the wandering trees with compression to
reduce overall writes. But if your eMMC is soldered onto a board, I
might consider F2FS instead. And Btrfs for other things.
--
Chris Murphy
ad
only to avoid causing worse problems, so next time you should qualify
the drive before putting it into production."
I'm willing to bet all the other file system devs would say something
like that even if Btrfs devs think something better could happen, it's
probably not a super high priority.
--
Chris Murphy
time a write failure
happens, the operation is always fatal regardless of the file system.
--
Chris Murphy
s used 66.61TiB
>devid1 size 72.77TiB used 68.03TiB path /dev/mapper/Cached-Backups
>
>Data, single: total=67.80TiB, used=66.52TiB
>System, DUP: total=40.00MiB, used=7.41MiB
>Metadata, DUP: total=98.50GiB, used=95.21GiB
>GlobalReserve, single: total=512.00MiB, used=0.00B
Even if all metadata is only csum tree, and ~200GiB needs to be
written, there's plenty of free space for it.
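A quick back-of-the-envelope check using the figures quoted above, as a sketch (the function name is made up; the numbers are the devid size and used values from the quoted output):

```shell
# Unallocated space on the device = device size - allocated ("used" on the devid line), in TiB.
unallocated_tib() {
    awk -v size="$1" -v used="$2" 'BEGIN { printf "%.2f\n", size - used }'
}

unallocated_tib 72.77 68.03
# prints: 4.74
```

That is about 4.74TiB still unallocated, far more than the ~200GiB in question.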
--
Chris Murphy
ing Btrfs, but you may have snapshots.
'sudo btrfs sub list -at /'
That should show all subvolumes (includes snapshots).
> [48479.254106] BTRFS info (device mmcblk0p3): 17 enospc errors during balance
Probably soft enospc errors it was able to work around.
--
Chris Murphy
. Hopefully that clears
> everything up.
I'd expect --init-csum-tree only recreates the data csum tree, and will
not assume metadata leaf is correct and just recompute a csum for it.
--
Chris Murphy
artctl ...` results.
OK so nothing fatal anyway. We'd have to see any kernel messages that
appeared during the balance to see if there were read or write errors,
but presumably any failure means the balance fails so... might get you
by for a while actually.
--
Chris Murphy
On Mon, Aug 27, 2018 at 6:38 PM, Chris Murphy wrote:
>> Metadata,single: Size:8.00MiB, Used:0.00B
>>/dev/mapper/master-root 8.00MiB
>>
>> Metadata,DUP: Size:2.00GiB, Used:562.08MiB
>>/dev/mapper/master-root 4.00GiB
>>
>> System,s
being an
asshole again.
> Chris Murphy wrote on Tue, 28 Aug 2018, 02:25:
>>
>> On Mon, Aug 27, 2018 at 4:51 PM, Cerem Cem ASLAN
>> wrote:
>> > Hi,
>> >
>> > I'm getting DRDY ERR messages which causes system crash on the server:
And by 4.14 I actually mean 4.14.60 or 4.14.62 (based on the
changelog). I don't think the single patch in 4.14.62 applies to your
situation.
starts from that point
again. I've done quite a lot of jerkface reboot -f and sysrq + b with
Btrfs and have never broken a file system so far (power failures,
different story) but maybe I'm lucky and I have a bunch of well
behaved devices.
--
Chris Murphy
On Fri, Aug 10, 2018 at 9:29 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> Chris Murphy posted on Fri, 10 Aug 2018 12:07:34 -0600 as excerpted:
>
>> But whether data is shared or exclusive seems potentially ephemeral, and
>> not something a sysadmin should even be able
really should be a high level basic per
directory quota implementation at the VFS layer, with a single kernel
interface as well as a single user space interface, regardless of the
file system. Additional file system specific quota features can of
course have their own tools, but all of this re-invention of the wheel
for basic directory quotas is a mystery to me.
--
Chris Murphy
,subvol=/root/var/lib/docker/btrfs)
And from the detail fpaste, you can see there is no such subvolume
docker/btrfs or docker/containers - and subvolid=265 is actually for
rootfs.
Anyway, mortals will be confused by this behavior.
--
Chris Murphy
device and there's nothing
Btrfs can do about that unless there is DUP or raid1+ metadata
available.
Is it possible this LV was accidentally reformatted ext4?
--
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.ke
by ssd_spread?
--
Chris Murphy
Related on XFS list.
https://www.spinics.net/lists/linux-xfs/msg20722.html
On Wed, Jul 18, 2018 at 12:01 PM, Austin S. Hemmelgarn
wrote:
> On 2018-07-18 13:40, Chris Murphy wrote:
>>
>> On Wed, Jul 18, 2018 at 11:14 AM, Chris Murphy
>> wrote:
>>
>>> I don't know for sure, but based on the addresses reported before and
>>> a