btrfs-progs: replace: error message can be improved when other operation is running
Dear all,

when trying to replace a device of a file system on which a balance is running, btrfs-progs fails with the error message:

  ERROR: ioctl(DEV_REPLACE_START) on '/mnt/xyz' returns error:

This might also be true for similar operations, such as "add", "delete" and "resize", since those cases do not seem to be considered in cmds-replace.c [0]. This is not very helpful to the user (if not scary). In contrast, other commands give very helpful output in similar situations (e.g., "add/delete/… operation in progress" [1]). Other users' confusion might also be related to this issue [2].

This is probably very easy to fix for someone familiar with the ioctl return values.

Thanks and cheers,

Lukas

GNU/Linux 4.13.0-0.bpo.1-amd64 #1 SMP Debian 4.13.13-1~bpo9+1 (2017-11-22) x86_64
btrfs-progs v4.13.3

[0] https://github.com/kdave/btrfs-progs/blob/11c83cefb8b4a03b1835efaf603ddc95430a0c9e/cmds-replace.c#L48
[1] https://github.com/kdave/btrfs-progs/blob/9fe889ac02b9c49b885c8999f5dd4e192697fa83/ioctl.h#L709
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=866734
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
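PS: until the message is improved, a wrapper script can at least fail with a clearer explanation before issuing the replace. This is only a sketch: the helper name is made up, and the exit-code convention of `btrfs balance status` (0 = not running, 1 = in progress, per btrfs-balance(8)) should be double-checked against the installed btrfs-progs version.

```shell
# safe_replace: hypothetical wrapper (name made up) that refuses to start
# a device replace while a balance is running on the same mount point.
# Assumption: `btrfs balance status` exits 0 when no balance is running
# and 1 while one is in progress -- verify against your version.
safe_replace() {
    srcdev="$1"; tgtdev="$2"; mnt="$3"
    if btrfs balance status "$mnt" >/dev/null 2>&1; then
        # exit status 0: no balance running, safe to proceed
        btrfs replace start "$srcdev" "$tgtdev" "$mnt"
    else
        echo "ERROR: balance in progress on $mnt, retry later" >&2
        return 1
    fi
}
```

The same guard could be extended to "add", "delete" and "resize", which hit the same exclusive-operation limitation.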
Re: zstd compression
Hi Imran,

On 11/15/2017 09:51 AM, Imran Geriskovan wrote as excerpted:
> Any further advices?

you might be interested in the thread "Read before you deploy btrfs + zstd"¹.

Cheers,

Lukas

¹ https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg69871.html
Re: Several questions regarding btrfs
On 11/01/2017 03:05 PM, ST wrote as excerpted:
>> However, it's important to know that if your users have shell access,
>> they can bypass qgroups. Normal users can create subvolumes, and new
>> subvolumes aren't added to an existing qgroup by default (and unless I'm
>> mistaken, aren't constrained by the qgroup set on the parent subvolume),
>> so simple shell access is enough to bypass quotas.
> I never did it before, but shouldn't it be possible to just whitelist
> commands users are allowed to use in the SSH config (and so block
> creation of subvolumes/cp --reflink)? I actually would have restricted
> users to sftp if I knew how to let them change their passwords once they
> wish to. As far as I know it is not possible with OpenSSH...

Possible only via a rather custom setup, I guess. You could

a) force users into a chroot via the sshd configuration (chroots need the allowed binaries plus their libs, configs, etc.), or
b) solve the problem with file permissions on all binaries (probably a terrible pain to set up (users, groups, …) and maintain).

Cheers,

Lukas
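PS: for variant a), the usual OpenSSH pattern is a Match block in sshd_config; the group name and path below are illustrative, and note that every component of the ChrootDirectory path must be root-owned and not writable by group or others:

```
# /etc/ssh/sshd_config (excerpt, illustrative values)
# Members of "sftponly" are confined to chrooted SFTP and cannot run
# shell commands such as `btrfs subvolume create` or `cp --reflink`.
Match Group sftponly
    ChrootDirectory /srv/chroot/%u
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
```

Using internal-sftp sidesteps the chroot-population problem entirely, since no extra binaries are needed inside the chroot; the password-change question remains unsolved, though, as noted above.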
Re: btrfs scrub crashes OS
On 09/26/2017 11:36 AM, Qu Wenruo wrote as excerpted:
> This is strange, this means that we can't find a chunk map for a 72K
> length data extent.
>
> Either the new mapper code has some bug, or it's a big problem.
> But I think it's more possible for former case.
>
> Would you please try to dump the chunk tree (which should be quite
> small) using the following command?
>
> $ btrfs inspect-internal dump-tree -t chunk

Sure, happy to provide that: https://static.lukas-pirl.de/dump-chunk-tree.txt (too large for Pastebin, the file will probably go away in a couple of weeks).

Cheers,

Lukas
Re: btrfs scrub crashes OS
Hi Qu,

On 09/26/2017 10:51 AM, Qu Wenruo wrote as excerpted:
> This make things more weird.
> Just in case, are you executing offline scrub by "btrfs scrub start
> --offline "

Yes. I even got some output (pretty sure the last lines are missing due to the crash):

WARNING: Offline scrub doesn't support extra options other than -r [I gave -d as well]
Invalid mapping for 644337258496-644337332224, got 645348196352-646421938176
Couldn't map the block 644337258496
ERROR: failed to read out data at bytenr 644337258496 mirror 1
Invalid mapping for 653402148864-653402152960, got 653938130944-655011872768
Couldn't map the block 653402148864
ERROR: failed to read out data at bytenr 653402148864 mirror 1
Invalid mapping for 717315420160-717315526656, got 718362640384-719436382208
Couldn't map the block 717315420160
ERROR: failed to read out data at bytenr 717315420160 mirror 1
Invalid mapping for 875072008192-875072040960, got 875128946688-876202688512
Couldn't map the block 875072008192
ERROR: failed to read tree block 875072008192 mirror 1
ERROR: extent 875072008192 len 32768 CORRUPTED: all mirror(s) corrupted, can't be recovered

Can I find out on which disk a mirror of a block is?

> If so, I think there may be some problem outside the btrfs territory.

Of course, that is a possibility…

> Offline scrub has nothing to do with btrfs kernel module, it just reads
> out on-disk data and verify checksum in *user* space.
>
> So if offline scrub can also screw up the system, it means there is
> something wrong in the disk IO routine, not btrfs.
>
> And scrub can trigger it because normal btrfs IO won't try to read that
> part/mirror.

…especially when considering this.

> What about trying to read all data out of your raw disk?
> If offline crashes the system, reading the disk may crash it also.
> Using dd to read each of your disk (with btrfs unmounted) may expose
> which disk caused the problem.

That is a good idea! Will go ahead. Thanks for your help so far.
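PS: the dd pass Qu suggests can be scripted so each device is read end to end in turn; a crash or read error then points at the offending disk. A sketch (the function name and the device list are illustrative; run it with the filesystem unmounted):

```shell
# read_check: sequentially read each given block device (or file) in
# full, discarding the data. A device that reports read errors -- or
# crashes the machine mid-read -- identifies the likely culprit.
read_check() {
    rc=0
    for dev in "$@"; do
        echo "reading $dev ..."
        if dd if="$dev" of=/dev/null bs=1M 2>/dev/null; then
            echo "OK: $dev"
        else
            echo "READ ERROR: $dev"
            rc=1
        fi
    done
    return $rc
}

# e.g.: read_check /dev/sda /dev/sdb /dev/sdc
```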
Re: btrfs scrub crashes OS
Dear Qu,

thanks for your reply.

On 09/25/2017 12:19 PM, Qu Wenruo wrote as excerpted:
> Even no dmesg output using tty or netconsole?

And thanks for the pointer to netconsole, I tried that one. No success: I set netconsole up, verified it worked, started a scrub, the machine went away after a couple of hours, netconsole empty.

> That's strange.
> Normally it should be kernel BUG_ON() to cause such problem.
>
> And if the system is still responsible (either from TTY or ssh), is
> there anything strange like tons of IO or CPU usage?

I can't tell, the machine just disappears from the network. Dead. IIRC, it was also all dead when I sat in front of it.

> Btrfs-progs v4.13 should have fixed it.
> As long as v4.13 btrfs check reports no error, its metadata should be
> good.

I can try that one, if helpful.

> You could try the out-of-tree offline scrub to do a full scrub of your
> fs unmounted, so it won't crash your system (if nothing wrong happened)
> https://github.com/gujx2017/btrfs-progs/tree/offline_scrub

Did that, machine crashed again.

>> MIXED_BACKREF, BIG_METADATA, EXTENDED_IREF, SKINNY_METADATA, NO_HOLES
>
> Only NO_HOLES is not ordinary, but shouldn't cause a problem.

Would it be sensible to turn that feature off using `btrfstune` (if possible at all)?

> Without kernel backtrace, it's tricky to locate the problem.
> So I would recommend to use netconsole (IIRC more reliable, as I use it
> on my test VM to capture the dying message) or TTY output to verify
> there is no kernel message/backtrace.

Yeah, I see we are in a tricky situation here. I will try to scrub with autodefrag and compression deactivated. Could a full balance be of any help? At least to find out if it crashes the machine as well?
Cheers,

Lukas

> Thanks,
> Qu
>
>> no quotas in use
>> see also https://pastebin.com/4me6zDsN for more details
>> btrfs-progs v4.12
>> GNU/Linux 4.12.0-0.bpo.1-amd64 #1 SMP Debian 4.12.6-1~bpo9+1 x86_64
>>
>> The question, obviously, is how can I make this fs "scrubable" again?
>> Are the errors found by btrfsck safe to repair using btrfsck or some
>> other tool?
>>
>> Thank you so much in advance,
>>
>> Lukas
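PS regarding the netconsole setup, for others finding this thread: it boils down to something like the following sketch. All addresses, ports, the interface and the MAC are illustrative; see Documentation/networking/netconsole.txt for the exact parameter format.

```shell
# sketch: enable netconsole so the dying kernel's messages reach a log
# host via UDP; every address/port/interface/MAC here is made up.
setup_netconsole() {
    # format: netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
    modprobe netconsole \
        "netconsole=6665@192.168.0.10/eth0,6666@192.168.0.2/00:11:22:33:44:55"
}

# on the log host (192.168.0.2), capture the messages with e.g.:
#   nc -u -l 6666 | tee netconsole.log
# (netcat flags differ between implementations)
```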
Re: Wrong device?
On 09/25/2017 06:11 PM, linux-bt...@oh3mqu.pp.hyper.fi wrote as excerpted:
> After a long googling (about more complex situations) I suddenly
> noticed "device sdb" WTF??? Filesystem is mounted from /dev/md3 (sdb
> is part of that mdraid) so btrfs should not even know anything about
> that /dev/sdb.

I would be interested in explanations regarding this too. It happened to me as well: I was confused by /dev/sd* device paths being printed by btrfs in the logs, even though it runs on /dev/md-* (/dev/mapper/*) devices exclusively.

> PS. I have noticed another bug too, but I haven't tested it with
> lastest kernels after I noticed that it happens only with
> compression=lzo. So maybe it is already fixed. With gzip or none
> compression probem does not happens. I have email server with about
> 0.5 TB volume. It is using Maildir so it contains huge amount of
> files. Sometimes some files goes unreadable. After server reset
> problematic file could be readable again (but not always)...
>
> But weird thing is that unreadable file always seems to be
> dovecot.index.log.

I can confirm this (non-reproducible) behavior on a VPS running Debian 4.5.4-1~bpo8+1.

Lukas
btrfs scrub crashes OS
Dear all,

I experience reproducible OS crashes when scrubbing a btrfs file system. Apart from that, the file system mounts rw and is usable without any problems (including modifying snapshots and all that).

When the system crashes (i.e., freezes), there are no errors printed to the system logs or via `dmesg` (had a display connected). Recovery is only possible via power-cycling the machine.

The host experienced a lot of crashes and ATA errors due to hardware failures in the past. To the best of my knowledge, the hardware is stable now. `btrfs device stats` outputs zeros for all counters.

`btrfsck --readonly --mode lowmem` outputs a bunch of

  referencer count mismatch …

and

  ERROR: data extent[… …] backref lost

see https://pastebin.com/seC4fReP for the full log.

System info:
* btrfs RAID 1 (~1.5 years old), 7 SATA HDDs
* MIXED_BACKREF, BIG_METADATA, EXTENDED_IREF, SKINNY_METADATA, NO_HOLES
* no quotas in use
* see also https://pastebin.com/4me6zDsN for more details
* btrfs-progs v4.12
* GNU/Linux 4.12.0-0.bpo.1-amd64 #1 SMP Debian 4.12.6-1~bpo9+1 x86_64

The question, obviously, is how can I make this fs "scrubable" again? Are the errors found by btrfsck safe to repair using btrfsck or some other tool?

Thank you so much in advance,

Lukas
Still in 4.4.0: livelock in recovery (free_reloc_roots)
On 11/20/2015 10:04 AM, Lukas Pirl wrote as excerpted:
> I am (still) trying to recover a RAID1 that can only be mounted
> recovery,degraded,ro.
>
> I experienced an issue that might be interesting for you: I tried to
> mount the file system rw,recovery and the kernel ended up burning one
> core (and only one specific core, never scheduled to another one).
>
> The watchdog printed a stack trace roughly every 20 seconds. There were
> only a few stack traces that were printed alternating (see below).
> After a few hours with the mount command still being blocked and without
> visible IO activity, the system was power-cycled.
>
> Summary:
>
> Call Trace:
> [] ? free_reloc_roots+0x11/0x30 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> Call Trace:
> [] ? rcu_dump_cpu_stacks+0x80/0xb0
> [] ? rcu_check_callbacks+0x421/0x6e0
> [] ? sched_clock+0x5/0x10
> [] ? notifier_call_chain+0x45/0x70
> [] ? timekeeping_update+0xf1/0x150
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? update_process_times+0x36/0x60
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? tick_sched_handle.isra.15+0x24/0x60
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? tick_sched_timer+0x3b/0x70
> [] ? __hrtimer_run_queues+0xdc/0x210
> [] ? read_tsc+0x5/0x10
> [] ? read_tsc+0x5/0x10
> [] ? hrtimer_interrupt+0x9a/0x190
> [] ? smp_apic_timer_interrupt+0x39/0x50
> [] ? apic_timer_interrupt+0x6b/0x70
> [] ? _raw_spin_lock+0x10/0x20
> [] ? __del_reloc_root+0x2f/0x100 [btrfs]
> [] ? __add_reloc_root+0xe0/0xe0 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> Call Trace:
> [] ? __del_reloc_root+0x2f/0x100 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> A longer excerpt can be found here: http://pastebin.com/NPM0Ckfy
>
> I am using kernel 4.2.6 (Debian backports) and btrfs-tools 4.3.
>
> btrfs check --readonly gave no errors.
> (except the probably false positives mentioned here
> http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg48325.html)
>
> Reading the whole file system worked also.
>
> If you need more information to trace this back, let me know and I'll
> try to get it.
> If you have suggestions regarding the recovery, please let me know as well.
Re: Fixing recursive fault and parent transid verify failed
On 12/07/2015 02:57 PM, Alistair Grant wrote as excerpted:
> Fixing recursive fault, but reboot is needed

For the record: I saw the same message (incl. hard lockup) when doing a balance on a single-disk btrfs. Besides that, the fs works flawlessly (~60GB; usage: no snapshots, ~15 lxc containers, low-load databases, few mails, a couple of Web servers). As this is a production machine, I rebooted it rather than investigating, but the error is reproducible if that would be of great interest.

> I've ran btrfs scrub and btrfsck on the drives, with the output
> included below. Based on what I've found on the web, I assume that a
> btrfs-zero-log is required.
>
> * Is this the recommended path?
> * Is there a way to find out which files will be affected by the loss of
>   the transactions?
>
> Kernel: Ubuntu 4.2.0-19-generic (which is based on mainline 4.2.6)

I used Debian Backports 4.2.6.

Cheers,

Lukas
Re: implications of mixed mode
On 11/27/2015 04:11 PM, Duncan wrote as excerpted:
> My big hesitancy would be over that fact that very few will run or test
> mixed-mode at TB scale filesystem level, and where they do, it's likely
> to be in ordered to work around the current (but set to soon be
> eliminated) metadata-only (no data) dup mode limit on single-device,
> since in that regard mixed-mode is treated as metadata and dup mode is
> allowed.
>
> So you're relatively more likely to run into rarely seen scaling issues
> and perhaps bugs that nobody else has ever run into as (relatively)
> nobody else runs mixed-mode on multi-terabyte-scale btrfs. If you want
> to be the guinea pig and make it easier for others to try later on, after
> you've flushed out the worst bugs, that's definitely one way to do it.
> =:^]

I see. This somehow aligns with Qu's answer.

> It's worth noting that rsync... seems to stress btrfs more than pretty
> much any other common single application. It's extremely heavy access
> pattern just seems to trigger bugs that nothing else does, and while they
> do tend to get fixed, it really does seem to push btrfs to the limits,
> and there have been a /lot/ of rsync triggered btrfs bugs reported over
> the years.

Well, IMHO btrfs /has/ to deal with rsync workloads if it wants to be an alternative for larger storage setups, but that is another story. I do run btrfs (non-mixed) with rsync workloads for quite a while now and it is doing well (except for the deadlock that has been around a while ago). Maybe my network is just slow enough to not trigger any unfixed weird issues with the intense access patterns of rsync. Anyways, thanks for the hint!

> Between the stresses of rsyncing half a TiB daily and the relatively
> untested quantity that is mixed-mode btrfs at multi-terabyte scales on
> multi-devices, there's a reasonably high chance that you /will/ be
> working with the devs on various bugs for awhile. If you're willing to
> do it, great, somebody putting the filesystem thru those kinds of
> mixed-mode paces at that scale is just the sort of thing we need to get
> coverage on that particular not yet well tested corner case, but don't
> expect it to be particularly stable for a couple kernel cycles anyway,
> and after that, you'll still be running a particularly rare corner-case
> that's likely to put new code thru its paces as well, so just be aware of
> the relatively stony path you're signing up to navigate, should you
> choose to go that route.

Makes perfect sense. I think I sadly do not have the resources to be that guinea pig…

> Meanwhile, assuming you're /not/ deliberately setting out to test a
> rarely tested corner-case with stress tests known to rather too
> frequently get the best of btrfs...
>
> Why are you considering mixed-mode here? At that size the ENOSPC hassles
> of unmixed-mode btrfs on say single-digit GiB and below really should be
> dwarfed into insignificance, particularly since btrfs since 3.17 or so
> deletes empty chunks instead of letting them build up to the point where
> they're a problem, so what possible reason, other than simply to test it
> and cover that corner-case, could justify mixed-mode at that sort of
> scale?
>
> Unless of course, given that you didn't mention number of devices or
> individual device size, only the 8 TB total, you have in mind a raid of
> something like 1000 8-GB USB sticks, or the like, in which case
> mixed-mode on the individual sticks might make some sense (well, to the
> extent that a 1000-device raid of /anything/ makes sense! =:^), given
> their 8-GB each size.

That is not the case. I only came to consider it because I wondered why mixed-mode is not generally preferred when data and metadata have the same replication level.

Thanks Duncan!

Lukas
implications of mixed mode
Dear list,

if a larger RAID file system (say disk space of 8 TB in total) is created in mixed mode, what are the implications?

From reading the mailing list and the Wiki, I can think of the following:

+ less hassle with "false positive" ENOSPC
- data and metadata have to have the same replication level forever (e.g. RAID 1)
- higher fragmentation (does this reduce with no(dir)atime?)
  -> more work for autodefrag

Is that roughly what is to be expected? Any implications on recovery etc.?

In the specific case, the file system usage is as follows:
* data spread over ~20 subvolumes
* snapshotted with various frequencies
* compression is used
* mostly archive storage
* write once
* read infrequently
* ~500GB of daily rsync'ed system backup

Thanks in advance,

Lukas
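For reference, mixed mode is chosen at mkfs time and cannot be converted away from later. Roughly (device paths and label are made up; mkfs.btrfs selects mixed chunks by default only for very small filesystems, so at this scale it must be requested explicitly):

```
# create the RAID 1 with mixed data/metadata chunks
mkfs.btrfs --mixed -d raid1 -m raid1 -L archive /dev/sdb /dev/sdc

# with --mixed, the -d and -m profiles must be identical -- this is the
# "same replication level forever" constraint mentioned above
```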
Re: 4.2.6: livelock in recovery (free_reloc_roots)?
On 11/21/2015 10:01 PM, Alexander Fougner wrote as excerpted:
> This is fixed in btrfs-progs 4.3.1, that allows you to delete a
> device again by the 'missing' keyword.

Thanks Alexander! I had just found the thread reporting the bug, but not the patch or the btrfs-tools version it was merged in.

Lukas
Re: 4.2.6: livelock in recovery (free_reloc_roots)?
On 11/21/2015 08:16 PM, Duncan wrote as excerpted:
> Lukas Pirl posted on Sat, 21 Nov 2015 13:37:37 +1300 as excerpted:
>
>> Can "btrfs_recover_relocation" be prevented from being run? I would not
>> mind losing a few recent writes (what was a balance) but instead going
>> rw again, so I can restart a balance.
>
> I'm not familiar with that thread name (I run multiple small btrfs on
> ssds, so scrub, balance, etc, take only a few minutes at most), but if

First, thank you Duncan for taking the time to hack in those broad explanations.

I am not sure if this name also corresponds to a thread name, but it is for sure a function that appears in all the dumped traces when trying to 'mount -o recovery,degraded' the file system in question:

[] ? free_reloc_roots+0x1d/0x30 [btrfs]
[] ? merge_reloc_roots+0x165/0x220 [btrfs]
[] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
[] ? open_ctree+0x20d2/0x23b0 [btrfs]
[] ? btrfs_mount+0x87b/0x990 [btrfs]

> it's the balance thread, then yes, there's a mount option that cancels a
> running balance. See the wiki page covering mount options.

Yes, the file system is mounted with '-o skip_balance'. (Although the '-o recovery' might trigger relocations?!)

>> From what I have read, btrfs-zero-log would not help in this case (?) so
>> I did not run it so far.
>
> Correct. Btrfs is atomic at commit time, so doesn't need a journal in
> the sense of older filesystems like reiserfs, jfs and ext3/4.
> …
> Otherwise, it generally does no good, and while
> it generally does no serious harm beyond the loss of a few seconds worth
> of fsyncs, etc, either, because the commits /are/ atomic and zeroing the
> log simply returns the system to the state of such a commit, it's not
> recommended as it /does/ needlessly kill the log of those last few
> seconds of fsyncs.

So I see that it does no good but no serious harm (generally). Since it is related to writes (not relocations, I assume), clearing the log is unlikely to fix the problem with btrfs_recover_relocation or merge_reloc_roots, respectively.

Maybe a dev helps us and shines some light on the (I assume) impossible relocation issue.

Best,

Lukas
4.2.6: livelock in recovery (free_reloc_roots)?
Dear list,

I am (still) trying to recover a RAID1 that can only be mounted recovery,degraded,ro.

I experienced an issue that might be interesting for you: I tried to mount the file system rw,recovery and the kernel ended up burning one core (and only one specific core, never scheduled to another one).

The watchdog printed a stack trace roughly every 20 seconds. There were only a few stack traces that were printed alternating (see below). After a few hours with the mount command still being blocked and without visible IO activity, the system was power-cycled.

Summary:

Call Trace:
[] ? free_reloc_roots+0x11/0x30 [btrfs]
[] ? free_reloc_roots+0x1d/0x30 [btrfs]
[] ? merge_reloc_roots+0x165/0x220 [btrfs]
[] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
[] ? open_ctree+0x20d2/0x23b0 [btrfs]
[] ? btrfs_mount+0x87b/0x990 [btrfs]
[] ? pcpu_next_unpop+0x3f/0x50
[] ? mount_fs+0x36/0x170
[] ? vfs_kern_mount+0x68/0x110
[] ? btrfs_mount+0x1bb/0x990 [btrfs]
…

Call Trace:
[] ? rcu_dump_cpu_stacks+0x80/0xb0
[] ? rcu_check_callbacks+0x421/0x6e0
[] ? sched_clock+0x5/0x10
[] ? notifier_call_chain+0x45/0x70
[] ? timekeeping_update+0xf1/0x150
[] ? tick_sched_do_timer+0x40/0x40
[] ? update_process_times+0x36/0x60
[] ? tick_sched_do_timer+0x40/0x40
[] ? tick_sched_handle.isra.15+0x24/0x60
[] ? tick_sched_do_timer+0x40/0x40
[] ? tick_sched_timer+0x3b/0x70
[] ? __hrtimer_run_queues+0xdc/0x210
[] ? read_tsc+0x5/0x10
[] ? read_tsc+0x5/0x10
[] ? hrtimer_interrupt+0x9a/0x190
[] ? smp_apic_timer_interrupt+0x39/0x50
[] ? apic_timer_interrupt+0x6b/0x70
[] ? _raw_spin_lock+0x10/0x20
[] ? __del_reloc_root+0x2f/0x100 [btrfs]
[] ? __add_reloc_root+0xe0/0xe0 [btrfs]
[] ? free_reloc_roots+0x1d/0x30 [btrfs]
[] ? merge_reloc_roots+0x165/0x220 [btrfs]
[] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
[] ? open_ctree+0x20d2/0x23b0 [btrfs]
[] ? btrfs_mount+0x87b/0x990 [btrfs]
[] ? pcpu_next_unpop+0x3f/0x50
[] ? mount_fs+0x36/0x170
[] ? vfs_kern_mount+0x68/0x110
[] ? btrfs_mount+0x1bb/0x990 [btrfs]
…

Call Trace:
[] ? __del_reloc_root+0x2f/0x100 [btrfs]
[] ? free_reloc_roots+0x1d/0x30 [btrfs]
[] ? merge_reloc_roots+0x165/0x220 [btrfs]
[] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
[] ? open_ctree+0x20d2/0x23b0 [btrfs]
[] ? btrfs_mount+0x87b/0x990 [btrfs]
[] ? pcpu_next_unpop+0x3f/0x50
[] ? mount_fs+0x36/0x170
[] ? vfs_kern_mount+0x68/0x110
[] ? btrfs_mount+0x1bb/0x990 [btrfs]
…

A longer excerpt can be found here: http://pastebin.com/NPM0Ckfy

I am using kernel 4.2.6 (Debian backports) and btrfs-tools 4.3.

btrfs check --readonly gave no errors (except the probably false positives mentioned here: http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg48325.html).

Reading the whole file system worked also.

If you need more information to trace this back, let me know and I'll try to get it. If you have suggestions regarding the recovery, please let me know as well.

Best regards,

Lukas
Re: 4.2.6: livelock in recovery (free_reloc_roots)?
A follow-up question: can "btrfs_recover_relocation" be prevented from being run? I would not mind losing a few recent writes (what was a balance) but instead going rw again, so I can restart a balance.

From what I have read, btrfs-zero-log would not help in this case (?) so I did not run it so far.

By the way, I can confirm the defect of 'btrfs device remove missing …' mentioned here: http://www.spinics.net/lists/linux-btrfs/msg48383.html :

$ btrfs device delete missing /mnt/data
ERROR: missing is not a block device
$ btrfs device delete 5 /mnt/data
ERROR: 5 is not a block device

Thanks and best regards,

Lukas

On 11/20/2015 10:04 PM, Lukas Pirl wrote as excerpted:
> Dear list,
>
> I am (still) trying to recover a RAID1 that can only be mounted
> recovery,degraded,ro.
>
> I experienced an issue that might be interesting for you: I tried to
> mount the file system rw,recovery and the kernel ended up burning one
> core (and only one specific core, never scheduled to another one).
>
> The watchdog printed a stack trace roughly every 20 seconds. There were
> only a few stack traces that were printed alternating (see below).
> After a few hours with the mount command still being blocked and without
> visible IO activity, the system was power-cycled.
>
> Summary:
>
> Call Trace:
> [] ? free_reloc_roots+0x11/0x30 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> Call Trace:
> [] ? rcu_dump_cpu_stacks+0x80/0xb0
> [] ? rcu_check_callbacks+0x421/0x6e0
> [] ? sched_clock+0x5/0x10
> [] ? notifier_call_chain+0x45/0x70
> [] ? timekeeping_update+0xf1/0x150
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? update_process_times+0x36/0x60
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? tick_sched_handle.isra.15+0x24/0x60
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? tick_sched_timer+0x3b/0x70
> [] ? __hrtimer_run_queues+0xdc/0x210
> [] ? read_tsc+0x5/0x10
> [] ? read_tsc+0x5/0x10
> [] ? hrtimer_interrupt+0x9a/0x190
> [] ? smp_apic_timer_interrupt+0x39/0x50
> [] ? apic_timer_interrupt+0x6b/0x70
> [] ? _raw_spin_lock+0x10/0x20
> [] ? __del_reloc_root+0x2f/0x100 [btrfs]
> [] ? __add_reloc_root+0xe0/0xe0 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> Call Trace:
> [] ? __del_reloc_root+0x2f/0x100 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> A longer excerpt can be found here: http://pastebin.com/NPM0Ckfy
>
> I am using kernel 4.2.6 (Debian backports) and btrfs-tools 4.3.
>
> btrfs check --readonly gave no errors.
> (except the probably false positives mentioned here
> http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg48325.html)
>
> Reading the whole file system worked also.
>
> If you need more information to trace this back, let me know and I'll
> try to get it.
> If you have suggestions regarding the recovery, please let me know as well.
>
> Best regards,
>
> Lukas
Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk
On 11/21/2015 01:47 PM, Qu Wenruo wrote as excerpted:
> Hard to say, but we'd better keep an eye on this issue.
> At least, if it happens again, we should know if it's related to
> something like newer kernel or snapshots.

I can confirm the initially described behavior of "btrfs check"; reading the data also works fine.

Versions etc.:

$ uname -a
Linux 4.2.0-0.bpo.1-amd64 #1 SMP Debian 4.2.6-1~bpo8+1 …

$ btrfs filesystem show /mnt/data
Label: none  uuid: 5be372f5-5492-4f4b-b641-c14f4ad8ae23
        Total devices 6 FS bytes used 2.87TiB
        devid    1 size 931.51GiB used 636.00GiB path /dev/mapper/…SZ
        devid    2 size 931.51GiB used 634.03GiB path /dev/mapper/…03
        devid    3 size 1.82TiB used 1.53TiB path /dev/mapper/…76
        devid    4 size 1.82TiB used 1.53TiB path /dev/mapper/…78
        devid    6 size 1.82TiB used 1.05TiB path /dev/mapper/…UK
        *** Some devices missing

btrfs-progs v4.3

$ btrfs subvolume list /mnt/data | wc -l
62

Best,

Lukas
Re: anything wrong with `balance -dusage -musage` together?
On 11/20/2015 12:59 PM, Hugo Mills wrote as excerpted:
> Nothing actively wrong with that, no. It certainly won't break
> anything. It's just rarely actually useful. The usual situation is
> that you run out of one kind of storage before the other (data vs
> metadata, that is), and you need to free up some allocation of one of
> them so it can go to the other. This is typically too much data
> allocation, and metadata has run out (so -d is more often used than
> -m).
>
> For the "usual" case of running out of metadata allocation, you
> don't actually need much space to reclaim, so -dlimit=X for small X is
> an easier approach to use.

Thanks, Hugo, for your quick reply. Alright, looking at
https://github.com/kdave/btrfsmaintenance just got me thinking about
regular balances using -*usage before one or the other kind of space
runs out (as suggested there).

Best,
Lukas
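For concreteness, the two approaches discussed in this exchange could be sketched as below. This is an illustration only, not from the thread: the mount point /mnt/data is assumed, and the commands are printed rather than executed unless DRY_RUN=0 is set, since a real balance needs root and a mounted btrfs filesystem.

```shell
#!/bin/sh
# Sketch only: maintenance balance with usage filters.
# /mnt/data is a hypothetical mount point.
MNT="${MNT:-/mnt/data}"

# Print commands instead of executing them unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# Combining -dusage and -musage in one pass: relocate data chunks that
# are at most 50% used and metadata chunks that are at most 30% used.
run btrfs balance start -dusage=50 -musage=30 "$MNT"

# Hugo's alternative for the usual "metadata ran out" case: reclaim only
# a few data chunks so the freed space can be reallocated to metadata.
run btrfs balance start -dlimit=3 "$MNT"
```

Set DRY_RUN=0 (as root, with the filesystem mounted) to actually run the balances.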
anything wrong with `balance -dusage -musage` together?
Hi list,

I rarely see balance used with -dusage -musage together, esp. with
values other than zero. The question is: is there anything wrong with
running (say) `balance -dusage=50 -musage=30` regularly?

Thanks and best regards,
Lukas
Re: corrupted RAID1: unsuccessful recovery / help needed
TL;DR: thanks, but recovery is still preferred over recreation.

Hello Duncan, and thanks for your reply!

On 10/26/2015 09:31 PM, Duncan wrote:
> FWIW... Older btrfs userspace such as your v3.17 is "OK" for normal
> runtime use, assuming you don't need any newer features, as in normal
> runtime, it's the kernel code doing the real work and userspace for
> the most part simply makes the appropriate kernel calls to do that
> work.
>
> But, once you get into a recovery situation like the one you're in
> now, current userspace becomes much more important, as the various
> things you'll do to attempt recovery rely far more on userspace code
> directly accessing the filesystem, and it's only the newest userspace
> code that has the latest fixes. So for a recovery situation, the
> newest userspace release (4.2.2 at present) as well as a recent kernel
> is recommended, and depending on the problem, you may at times need to
> run integration or apply patches on top of that.

I am willing to update before trying further repairs. Is e.g. "balance"
also influenced by the userspace tools, or does the kernel do the
actual work?

> General note about btrfs and btrfs raid. Given that btrfs itself
> remains a "stabilizing, but not yet fully mature and stable
> filesystem", while btrfs raid will often let you recover from a bad
> device, sometimes that recovery is in the form of letting you mount
> ro, so you can access the data and copy it elsewhere, before blowing
> away the filesystem and starting over.

If there is one subvolume that contains all other (read-only) snapshots
and there is insufficient storage to copy them all separately: is there
an elegant way to preserve those when moving the data across disks?

> Back to the problem at hand. Current btrfs has a known limitation when
> operating in degraded mode. That being, a btrfs raid may be
> write-mountable only once, degraded, after which it can only be
> read-only mounted.
> This is because under certain circumstances in degraded mode, btrfs
> will fall back from its normal raid mode to single-mode chunk
> allocation for new writes, and once there are single-mode chunks on
> the filesystem, btrfs mount isn't currently smart enough to check that
> all chunks are actually available on present devices, and simply jumps
> to the conclusion that there are single-mode chunks on the missing
> device(s) as well, so it refuses to mount writable after that, in
> order to prevent further damage to the filesystem and preserve the
> ability to mount at least ro, to copy off what isn't damaged.
>
> There's a patch in the pipeline for this problem, that checks
> individual chunks instead of leaping to conclusions based on the
> presence of single-mode chunks on a degraded filesystem with missing
> devices. If that's your only problem (which the backtraces might
> reveal but I as a non-dev btrfs user can't tell), the patches should
> let you mount writable.

Interesting, thanks for the insights.

> But that patch isn't in kernel 4.2. You'll need at least kernel
> 4.3-rc, and possibly btrfs integration, or to cherry-pick the patches
> onto 4.2.

Well, before digging into that, a hint that this is actually the case
would be appreciated. :)

> Meanwhile, in keeping with the admin's rule on backups: by definition,
> if you valued the data more than the time and resources necessary for
> a backup, you have a backup available; otherwise, by definition, you
> valued the data less than the time and resources necessary to back it
> up. Therefore, no worries. Regardless of the fate of the data, you
> saved what your actions declared most valuable to you: either the
> data, or the hassle and resources cost of the backup you didn't do. As
> such, if you don't have a backup (or if you do but it's outdated), the
> data at risk of loss is by definition of very limited value.
> That said, it appears you don't even have to worry about loss of that
> very limited value data, since mounting degraded,recovery,ro gives you
> stable access to it, and you can use the opportunity provided to copy
> it elsewhere, at least to the extent that the data we already know is
> of limited value is even worth the hassle of doing that. Which is
> exactly what I'd do. Actually, I've had to resort to btrfs restore[1]
> a couple times when the filesystem wouldn't mount at all, so the fact
> that you can mount it degraded,recovery,ro already puts you ahead of
> the game. =:^)
>
> So yeah, first thing, since you have the opportunity, unless your
> backups are sufficiently current that it's not worth the trouble, copy
> off the data while you can. Then, unless you wish to keep the
> filesystem around in case the devs want to use it to improve btrfs'
> recovery system, I'd just blow it away and start over, restoring the
> data from backup once you have a fresh filesystem to restore to.
> That's the simplest and fastest way to a fully working system once
> again, and what I did here after using btrfs restore to recover the
> delta between current and my backups.

Thanks for all the elaborations. I guess there are
corrupted RAID1: unsuccessful recovery / help needed
TL;DR: RAID1 does not recover; I guess the interesting part of the
stack trace is:

Call Trace:
[] __del_reloc_root+0x30/0x100 [btrfs]
[] free_reloc_roots+0x25/0x40 [btrfs]
[] merge_reloc_roots+0x18e/0x240 [btrfs]
[] btrfs_recover_relocation+0x374/0x420 [btrfs]
[] open_ctree+0x1b7d/0x23e0 [btrfs]
[] btrfs_mount+0x94e/0xa70 [btrfs]
[] ? find_next_bit+0x15/0x20
[] mount_fs+0x38/0x160
…

Hello list. I'd appreciate some help with repairing a corrupted RAID1.

Setup:
* Linux 4.2.0-12, Btrfs v3.17, `btrfs fi show`:
  uuid: 5be372f5-5492-4f4b-b641-c14f4ad8ae23
  Total devices 6 FS bytes used 2.87TiB
  devid 1 size 931.51GiB used 636.00GiB path /dev/mapper/WD-WCC4J7AFLTSZ
  devid 2 size 931.51GiB used 634.03GiB path /dev/mapper/WD-WCAU45343103
  devid 3 size 1.82TiB used 1.53TiB path /dev/mapper/WD-WCAVY6423276
  devid 4 size 1.82TiB used 1.53TiB path /dev/mapper/WD-WCAZAF872578
  devid 6 size 1.82TiB used 1.05TiB path /dev/mapper/WD-WMC4M0H3Z5UK
  *** Some devices missing
* disks are dm-crypted

What happened:
* devid 5 started to die (slowly)
* added a new disk (devid 6) and tried `btrfs device delete`
* failed with kernel crashes (guess:) due to heavy IO errors
* removed devid 5 from /dev (deactivated in dm-crypt)
* tried `btrfs balance`
* interrupted multiple times due to kernel crashes (probably due to the
  semi-corrupted file system?)
* file system did not mount anymore after a required hard reset
* no successful recovery so far: if not read-only, kernel IO blocks
  eventually (hard reset required)
* tried:
  * `-o degraded` -> IO freeze, kernel log: http://pastebin.com/Rzrp7XeL
  * `-o degraded,recovery` -> IO freeze, kernel log:
    http://pastebin.com/VemHfnuS
  * `-o degraded,recovery,ro` -> file system accessible, system stable
  * going rw again does not fix the problem

I did not run btrfs-zero-log so far because my oops did not look very
similar to the one in the Wiki, and I did not want to risk making
recovery harder.
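[Editor's sketch, not from the original message:] the mount attempts and the copy-off suggested elsewhere in the thread could look roughly as below. The device name is one of the listed members, /mnt/rescue is a hypothetical destination, and commands are printed rather than executed unless DRY_RUN=0, since nothing here should be run blindly against a broken filesystem.

```shell
#!/bin/sh
# Sketch of the recovery attempts described above; /mnt/rescue is a
# hypothetical destination. Commands are printed, not executed, unless
# DRY_RUN=0.
DEV="/dev/mapper/WD-WCC4J7AFLTSZ"   # any surviving member device
MNT="/mnt/data"
DEST="/mnt/rescue"

run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# The attempts, in order; per the report, only the last one was stable:
run mount -o degraded "$DEV" "$MNT"              # -> IO freeze
run mount -o degraded,recovery "$DEV" "$MNT"     # -> IO freeze
run mount -o degraded,recovery,ro "$DEV" "$MNT"  # -> accessible, stable

# While the data is readable, copy it off before recreating the
# filesystem. A plain copy flattens the snapshot structure; piping
# `btrfs send` into `btrfs receive` on another btrfs filesystem would
# preserve read-only snapshots.
run cp -a "$MNT/." "$DEST/"
```

Set DRY_RUN=0 only on a system where freezing IO is an acceptable risk.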
Thanks,
Lukas