Re: Any chance to get snapshot-aware defragmentation?
On Sunday 20 May 2018 12:59:28 CEST, Tomasz Pala wrote:
> On Sat, May 19, 2018 at 10:56:32 +0200, Niccolò Belli wrote:
>> snapper users with hourly snapshots will not have any use for it.
>
> Anyone with hourly snapshots is doomed anyway.

I do not agree: having hourly snapshots doesn't mean you cannot limit them to a reasonable number. In fact, you can simply keep a dozen of them and start discarding the older ones.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
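The "keep a dozen and discard the older ones" policy is just snapper configuration; a sketch of the relevant timeline-cleanup keys in /etc/snapper/configs/root (the values here are illustrative, not a recommendation):

```shell
# /etc/snapper/configs/root (excerpt) -- cap the timeline at roughly a
# dozen hourly snapshots plus a week of dailies, instead of the default
# long tail of weekly/monthly/yearly ones.
TIMELINE_CREATE="yes"
TIMELINE_CLEANUP="yes"
TIMELINE_LIMIT_HOURLY="12"
TIMELINE_LIMIT_DAILY="7"
TIMELINE_LIMIT_WEEKLY="0"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"
```

snapper's timeline cleanup job then prunes anything beyond those limits.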
Re: Any chance to get snapshot-aware defragmentation?
On Saturday 19 May 2018 01:55:30 CEST, Tomasz Pala wrote:
> The "defrag only not-snapshotted data" mode would be enough for many use cases and wouldn't require more RAM. One could run this before taking a snapshot and merge _at least_ the new data.

snapper users with hourly snapshots will not have any use for it.
Re: Any chance to get snapshot-aware defragmentation?
On Friday 18 May 2018 20:33:53 CEST, Austin S. Hemmelgarn wrote:
> With a bit of work, it's possible to handle things sanely. You can deduplicate data from snapshots, even if they are read-only (you need to pass the `-A` option to duperemove and run it as root), so it's perfectly reasonable to only defrag the main subvolume, and then deduplicate the snapshots against that (so that they end up all being reflinks to the main subvolume). Of course, this won't work if you're short on space, but if you're dealing with snapshots, you should have enough space that this will work (because even without defrag, it's fully possible for something to cause the snapshots to suddenly take up a lot more space).

Been there, tried that. Unfortunately, even if I skip the defrag, a simple

duperemove -drhA --dedupe-options=noblock --hashfile=rootfs.hash rootfs

is going to eat more space than was previously available (probably due to autodefrag?).

Niccolò
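Austin's recipe can be scripted: defragment only the writable subvolume, then let duperemove (as root, with -A) reflink the read-only snapshots back onto it. A minimal sketch — the paths and hashfile location are hypothetical, and the run() helper just prints the commands when DRYRUN is set, so nothing touches a real filesystem until you clear it:

```shell
#!/bin/bash
# Defrag only the main (writable) subvolume, then dedupe the read-only
# snapshots against it so they end up as reflinks into the main subvolume.
# Set DRYRUN=1 to preview the commands instead of executing them.
run() { ${DRYRUN:+echo} "$@"; }

dedupe_against_main() {
    local main=$1 snapdir=$2
    run btrfs filesystem defragment -r "$main"
    run duperemove -drhA --dedupe-options=noblock \
        --hashfile=/var/tmp/rootfs.hash "$main" "$snapdir"
}
```

Usage would be something like `DRYRUN=1 dedupe_against_main /mnt/@rootfs /mnt/@snapshots` to preview, then the same call as root without DRYRUN.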
Re: Any chance to get snapshot-aware defragmentation?
On Friday 18 May 2018 19:10:02 CEST, Austin S. Hemmelgarn wrote:
> and also forces the people who have ridiculous numbers of snapshots to deal with the memory usage or never defrag

Whoever has at least one snapshot is never going to defrag anyway, unless they are willing to double the used space.

Niccolò
Re: Any chance to get snapshot-aware defragmentation?
On Friday 18 May 2018 18:20:51 CEST, David Sterba wrote:
> Josef started working on that in 2014 and did not finish it. The patches can still be found in his tree. The problem is in excessive memory consumption when there are many snapshots that need to be tracked during the defragmentation, so there are measures to avoid OOM. There's infrastructure ready for use (shrinkers), there are maybe some problems but fundamentally it should work. I'd like to get the snapshot-aware working again too; we'd need to find a volunteer to resume the work on the patchset.

Yeah, I know of Josef's work, but 4 years have passed since then without any news on this front. What I would really like to know is why nobody resumed his work: is it because it's impossible to implement snapshot-aware defrag without excessive RAM usage, or is it simply because nobody is interested?

Niccolò
Re: Btrfs installation advices
On Tuesday 8 May 2018 09:50:23 CEST, Rolf Wald wrote:
> You need to build three partitions, e.g. named boot, swap, root.

You don't need to use an unencrypted boot if you use grub: https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Encrypted_boot_partition_.28GRUB.29

A few hints for btrfs + LUKS + swap: https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Btrfs_subvolumes_with_swap

Another solution is to use SED, as someone mentioned: https://wiki.archlinux.org/index.php/Self-Encrypting_Drives
The only downside is that you can rest assured there will be NSA backdoors in hardware crypto.

Even better, I suggest you move to ZFS and use native encryption: https://github.com/zfsonlinux/zfs/pull/5769
I recently got tired of btrfs never implementing things like snapshot-aware defrag (with no signs on the horizon that this is going to change soon), so I decided to switch my servers to ZFS. I'll let you know how the crypto works if you're interested. I'll keep using btrfs on the clients though, at least for now.

Niccolò
Any chance to get snapshot-aware defragmentation?
Hi,
I've been waiting for this feature for years, and initially it seemed like something that would be worked on sooner or later. A long time has passed without any progress, so I would like to know if there is any technical limitation preventing this, or if it's something that could possibly land in the near future.

Thanks,
Niccolò
Kickstarting snapshot-aware defrag?
Hi,
It seems to me that the proposal[1] for a snapshot-aware defrag has long been abandoned. Since many people badly need this feature, I thought about how to possibly speed up reaching this goal. I know of several bounty-based kickstarting platforms; among them the best ones are probably bountysource.com[2] and freedomsponsors.org[3]. With both platforms everyone interested can place a bounty on the issue, and if/when someone implements it they will collect the bounty. I created an issue on both of them just to show how the platforms would handle it. Since btrfs is a small community, before actually placing bounties and sponsoring it I would like to know if there is anyone against this development model, or anyone interested in implementing a feature because of a bounty.

Bests,
Niccolò

[1] https://www.spinics.net/lists/linux-btrfs/msg34539.html
[2] https://www.bountysource.com/issues/50004702-feature-request-snapshot-aware-defrag
[3] https://freedomsponsors.org/issue/817/feature-request-snapshot-aware-defrag?alert=KICKSTART
Re: Why do full balance and deduplication reduce available free space?
On 2017-10-02 21:35, Kai Krakow wrote:
> Besides defragging removing the reflinks, duperemove will unshare your snapshots when used in this way: if it sees duplicate blocks within the subvolumes you give it, it will potentially unshare blocks from the snapshots while rewriting extents. BTW, you should be able to use duperemove with read-only snapshots if used in read-only-open mode. But I'd rather suggest using bees instead: it works at whole-volume level, walking extents instead of files. That way it is much faster, doesn't reprocess already deduplicated extents, and it works with read-only snapshots. Until my patch it didn't like mixed nodatasum/datasum workloads. Currently this is fixed by just leaving nocow data alone, as users probably set nocow exactly so that extents are not fragmented and blocks are not relocated.

> Bad Btrfs Feature Interactions: btrfs read-only snapshots (never tested, probably wouldn't work well)

Unfortunately it seems that bees doesn't support read-only snapshots, so it's a no-go.

P.S. I tried duperemove with -A, but besides taking much longer it didn't improve the situation. Are you sure that the culprit is duperemove? AFAIK it shouldn't unshare extents...

Niccolò
Re: Why do full balance and deduplication reduce available free space?
Maybe this is because of the autodefrag mount option? I thought it wasn't supposed to unshare lots of extents...

Niccolò
RE: Is it really possible to dedupe read-only snapshots!?
On 2017-10-02 13:14, Paul Jones wrote:
> I use bees for deduplication and it will quite happily dedupe read-only snapshots.
>> AFAIK no, it isn't possible. Source: https://www.spinics.net/lists/linux-btrfs/msg60385.html
>> "It should be possible to deduplicate a read-only file to a read-write one, but that's probably not worth the effort in many real-world use cases."
> You could always change them to RW while dedupe is running then change back to RO.

AFAIK it will break send/receive; can someone confirm?

Niccolò
Re: Why do full balance and deduplication reduce available free space?
On 2017-10-02 12:16, Hans van Kranenburg wrote:
> On 10/02/2017 12:02 PM, Niccolò Belli wrote:
>> [...] Since I use lots of snapshots [...] I had to create a systemd timer to perform a full balance and deduplication each night.
> Can you explain what's your reasoning behind this 'because X it needs Y'? I don't follow.

Available free space is important to me, so I want snapshots to be deduplicated as well. Since I cannot deduplicate snapshots because they are read-only, the data must already be deduplicated before the snapshots are taken. I do not consider the hourly snapshots because in a day they will be gone anyway, but daily snapshots will stay there much longer, so I want them to be deduplicated.

Niccolò
Why do full balance and deduplication reduce available free space?
Hi,
I have several subvolumes mounted with compress-force=lzo and autodefrag. Since I use lots of snapshots (snapper keeps around 24 hourly snapshots, 7 daily snapshots and 4 weekly snapshots) I had to create a systemd timer to perform a full balance and deduplication each night. In fact, data needs to be already deduplicated when snapshots are created, otherwise I have no other way to deduplicate the snapshots.

This is how I perform the balance:

btrfs balance start --full-balance rootfs

This is how I perform the deduplication (duperemove is from git master):

duperemove -drh --dedupe-options=noblock --hashfile=../rootfs.hash

Looking at the logs I noticed something weird: available free space actually decreases after balance or deduplication.

This is just before the timer starts:

Overall:
    Device size:                 128.00GiB
    Device allocated:             49.03GiB
    Device unallocated:           78.97GiB
    Device missing:                  0.00B
    Used:                         43.78GiB
    Free (estimated):             82.97GiB  (min: 82.97GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB  (used: 0.00B)

Data,single: Size:44.00GiB, Used:40.00GiB
   /dev/sda5      44.00GiB

Metadata,single: Size:5.00GiB, Used:3.78GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      78.97GiB

I also manually performed a full balance just before the timer started:

Overall:
    Device size:                 128.00GiB
    Device allocated:             46.03GiB
    Device unallocated:           81.97GiB
    Device missing:                  0.00B
    Used:                         43.78GiB
    Free (estimated):             82.96GiB  (min: 82.96GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB  (used: 0.00B)

Data,single: Size:41.00GiB, Used:40.01GiB
   /dev/sda5      41.00GiB

Metadata,single: Size:5.00GiB, Used:3.77GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      81.97GiB

As you can see, even doing a full balance was enough to reduce the available free space!

Then the timer started and it performed the deduplication:

Overall:
    Device size:                 128.00GiB
    Device allocated:             46.03GiB
    Device unallocated:           81.97GiB
    Device missing:                  0.00B
    Used:                         43.87GiB
    Free (estimated):             82.94GiB  (min: 82.94GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB  (used: 176.00KiB)

Data,single: Size:41.00GiB, Used:40.03GiB
   /dev/sda5      41.00GiB

Metadata,single: Size:5.00GiB, Used:3.84GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      81.97GiB

Once again it reduced the available free space! Then, after the deduplication, the timer also performed a full balance:

Overall:
    Device size:                 128.00GiB
    Device allocated:             46.03GiB
    Device unallocated:           81.97GiB
    Device missing:                  0.00B
    Used:                         44.00GiB
    Free (estimated):             82.93GiB  (min: 82.93GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB  (used: 0.00B)

Data,single: Size:41.00GiB, Used:40.04GiB
   /dev/sda5      41.00GiB

Metadata,single: Size:5.00GiB, Used:3.97GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      81.97GiB

It further reduced the available free space! Balance and deduplication actually reduced my available free space by 400MB! 400MB each night! How is it possible? Should I avoid doing balances and deduplications at all?

Thanks,
Niccolò
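For anyone chasing the same discrepancy: `btrfs filesystem usage` only shows chunk-level totals, which mix data and metadata allocation. `btrfs filesystem du` (available in reasonably recent btrfs-progs, around 4.6 and later) breaks a subtree down into total/exclusive/shared bytes, which is what actually changes when dedupe rewrites extent sharing. A sketch, with the mountpoint hypothetical and the run() helper printing instead of executing when DRYRUN is set:

```shell
#!/bin/bash
# Print both views of space usage: the chunk-level allocation summary
# and the per-subtree shared/exclusive breakdown.
run() { ${DRYRUN:+echo} "$@"; }

space_report() {
    local mnt=$1
    run btrfs filesystem usage "$mnt"     # chunk-level view
    run btrfs filesystem du -s "$mnt"     # Total / Exclusive / Set shared
}
```

Comparing the "Set shared" column before and after the nightly job would show whether sharing is actually being lost, rather than inferring it from the allocation totals.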
Cannot fix btrfs errors after system crash
Hi,
I was trying to use AMDGPU-PRO's OpenCL stack (with the mainline 4.12.13 kernel) when it suddenly crashed the whole system; not even the magic sysrq keys worked anymore. Unsurprisingly, at the next reboot I found several btrfs warnings (see https://paste.pound-python.org/show/S5zBG2tXZUTLG699saE5/). Since btrfs scrub didn't find any error, I decided to reboot into a live usb and start a btrfs check (I'm using btrfs-progs 4.13). It did find lots of errors indeed (see https://paste.pound-python.org/show/IPxh9sly0EEb0MKPi2dw/). So I made a full backup with dd and started a btrfs check --repair (see https://paste.pound-python.org/show/c9AlT8ehKKJy6l5xhzXk/). I also wiped the space cache with --clear-space-cache v1. A subsequent btrfs check revealed it had indeed fixed lots of errors (see https://paste.pound-python.org/show/1m2Wodd1q3n0eRlxLpZB/), but unfortunately I still have the following errors:

unresolved ref dir 7450239 index 2 namelen 6 name 431886 filetype 1 errors 80, filetype mismatch
unresolved ref dir 7450595 index 2 namelen 6 name 431886 filetype 1 errors 80, filetype mismatch
unresolved ref dir 7457122 index 2 namelen 6 name 431886 filetype 1 errors 80, filetype mismatch

I'm already quite satisfied, to be honest: two years ago repair used to eat my data, making things worse. Anyway, why didn't btrfs check repair them? Is there anything I can do to fix them?

Thanks,
Niccolò
Re: RAID56 status?
+1

On Tuesday 24 January 2017 00:31:42 CET, Christoph Anton Mitterer wrote:
> On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote:
>> We've been focusing on the single-drive use cases internally. This year that's changing as we ramp up more users in different places. Performance/stability work and raid5/6 are the top of my list right now.
>>
>> Cheers, Chris.
>
> +1
> Would be nice to get some feedback on what happens behind the scenes... actually I think a regular btrfs development blog could be generally a nice thing :)
Re: Convert from RAID 5 to 10
On Thursday 1 December 2016 10:37:13 CET, Wilson Meier wrote:
> The only thing I have asked for is to document the *known* problems/flaws/limitations of all raid profiles and link to them from the stability matrix.

+1

Does anyone mind if I ask for an account and start copy-pasting any relevant posts from this thread?

Niccolò Belli
Re: Convert from RAID 5 to 10
I completely agree, the whole wiki status is simply *FRUSTRATING*.

Niccolò Belli

On Wednesday 30 November 2016 14:12:36 CET, Wilson Meier wrote:
> On 30/11/16 at 11:41, Duncan wrote:
>> Wilson Meier posted on Wed, 30 Nov 2016 09:35:36 +0100 as excerpted: ...
>
> Hi Duncan,
> I understand your arguments but cannot fully agree. First of all, I'm not sticking with old stale versions of whatever, as I try to keep my system up to date. My kernel is 4.8.4 (Gentoo) and btrfs-progs is 4.8.4. That being said, I'm quite aware of the heavy development status of btrfs, but pointing the finger at users, saying that they don't fully understand the status of btrfs, without giving the information on the wiki is in my opinion not the right way. Heavy development doesn't mean that features marked as ok are "not" or "mostly" ok in the context of overall btrfs stability. There is no indication on the wiki that raid1 or any other raid level (except for raid5/6) suffers from the problems stated in this thread. If there are known problems then the stability matrix should point them out or link to a corresponding wiki entry; otherwise one has to assume that the features marked as "ok" are in fact "ok". And yes, the overall btrfs stability should be put on the wiki.
>
> Just to give you a quick overview of my history with btrfs: I migrated away from MD RAID and ext4 to btrfs raid6 because of its CoW and checksum features, at a time when raid6 was not considered fully stable but also not badly broken. After a few months I had a disk failure and the raid could not recover. I looked at the wiki and the mailing list and noticed that raid6 had been marked as badly broken :( I was quite happy to have a backup. So I asked on the btrfs IRC channel (the wiki had no relevant information) whether raid10 is usable or suffers from the same problems. The summary was "Yes, it is usable and has no known problems". So I migrated to raid10. Now I know that raid10 (marked as ok) also has problems with 2 disk failures in different stripes and can in fact lead to data loss. I thought, hmm ok, I'll split my data and use raid1 (marked as ok). And again the mailing list states that raid1 also has problems in case of recovery. It is really disappointing not to have this information in the wiki itself. This would have saved me, and I'm quite sure others too, a lot of time. Sorry for being a bit frustrated.
Re: [Not TLS] mount option nodatacow for VMs on SSD?
On Tuesday 29 November 2016 06:14:18 CET, Duncan wrote:
> Very good question that I don't know the answer to, as I've not seen it discussed previously. (I'm not a dev, just a list regular and user of btrfs myself, and my personal use-case involves neither snapshots nor send/receive, so on those topics, if I've not seen it covered previously either here or on the wiki, I won't know.) Someone else know?

Sounds too good to be true; I somehow feel the answer will be "no" :(

Niccolò Belli
Re: mount option nodatacow for VMs on SSD?
On Monday 28 November 2016 09:20:15 CET, Kai Krakow wrote:
> You can, however, use chattr to make the subvolume root directory (the one where it is mounted) nodatacow (chattr +C) _before_ placing any files or directories in there. That way, newly created files and directories will inherit the flag. Take note that this flag can only be applied to directories and empty (zero-sized) files.

Do I keep checksumming for this directory that way?

Niccolò Belli
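Kai's recipe, spelled out (directory path hypothetical). As for the checksumming question: no — on btrfs, NOCOW files are also nodatasum, so files created under a +C directory are not checksummed (and are not compressed either). The run() helper prints instead of executing when DRYRUN is set:

```shell
#!/bin/bash
# Create the directory while it's still empty, set +C, then populate it;
# new files inherit NOCOW. Note: NOCOW also disables data checksumming.
run() { ${DRYRUN:+echo} "$@"; }

make_nocow_dir() {
    local dir=$1
    run mkdir -p "$dir"
    run chattr +C "$dir"
    run lsattr -d "$dir"   # should show the 'C' flag
}
```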
My system mounts the wrong btrfs partition, from the wrong disk!
This is something pretty unbelievable, so I had to repeat it several times before finding the courage to actually post it to the mailing list :)

After dozens of data losses I don't trust my btrfs partition that much, so I make a backup copy with dd weekly. Yesterday I was going to do some balancing and deduplication, but since the disk was full I had to remove the contents of a whole subvolume (16GB) to make some space available to the tools (I had the backup made with dd on an external drive). After the balance and deduplication I attached and mounted the external drive with the backup and then I mounted the img file with the copy of the partition. To my great surprise, the subvolume in the backup which should have had the files I deleted was empty! So I rebooted to a live usb and mounted the backup again: my files were still there, phew! Then I mounted the partition on the laptop's disk and tried to copy the files from the backup, and it complained that they already existed! If I unmount both the backup and the real disk and then mount the real disk, it's empty again, as it should be.
These are the exact steps to reproduce it from the live usb:

# Opening the encrypted partition from the real disk and then mounting it
cryptsetup luksOpen /dev/sda5 cryptroot
mount -o noatime,compress=lzo,autodefrag /dev/mapper/cryptroot /real_disk
ls /real_disk/@Pictures   --> empty (as it should be)

# Mounting the external disk with the backup
mount /dev/sdb1 /external_disk
# Mounting the unencrypted backup from the external disk
mount /external_disk/backup.img /backup
ls /backup/@Pictures      --> empty (*it shouldn't be!*)

umount /backup
umount /external_disk
cryptsetup luksClose cryptroot

# Mounting the external disk with the backup
mount /dev/sdb1 /external_disk
# Mounting the unencrypted backup from the external disk
mount /external_disk/backup.img /backup
ls /backup/@Pictures      --> 16GB of photos (as it should be)

# Opening the encrypted partition from the real disk and then mounting it
cryptsetup luksOpen /dev/sda5 cryptroot
mount -o noatime,compress=lzo,autodefrag /dev/mapper/cryptroot /real_disk
ls /real_disk/@Pictures   --> 16GB of photos (it *shouldn't* be!)

I really don't know where the bug may lie, probably not even in btrfs, but I didn't know where else to report it. I'm using Arch Linux with kernel 4.8.10, and the live usb is an Arch live image with kernel 4.8 too.

Niccolò Belli
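A guess at the cause, offered as an assumption rather than a diagnosis: a dd image keeps the original filesystem's UUID, and btrfs identifies devices by UUID, so having the original and the copy visible at the same time can make the kernel attach the wrong device to a mount. One way to check for the collision and (with btrfs-progs 4.1 or later) give the unmounted copy a fresh UUID — image path hypothetical, and run() only prints when DRYRUN is set:

```shell
#!/bin/bash
# Look for two filesystems sharing one UUID, then randomize the fsid of
# the backup image. Only ever run btrfstune on the *unmounted* copy,
# never on the live filesystem.
run() { ${DRYRUN:+echo} "$@"; }

check_and_fix_uuid() {
    local backup_img=$1
    run btrfs filesystem show      # duplicate UUIDs show up here
    run btrfstune -u "$backup_img"
}
```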
Increased disk usage after deduplication and system running out of memory
[...]
Unallocated:
   /dev/mapper/cryptroot    16.35GiB

$ cat after_duperemove_and_balance
Overall:
    Device size:                 152.36GiB
    Device allocated:            136.03GiB
    Device unallocated:           16.33GiB
    Device missing:                  0.00B
    Used:                        133.81GiB
    Free (estimated):             16.55GiB  (min: 16.55GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB  (used: 0.00B)

Data,single: Size:127.00GiB, Used:126.77GiB
   /dev/mapper/cryptroot   127.00GiB

Metadata,single: Size:9.00GiB, Used:7.03GiB
   /dev/mapper/cryptroot     9.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/mapper/cryptroot    32.00MiB

Unallocated:
   /dev/mapper/cryptroot    16.33GiB

As you can see it freed 5.41 GB of data, but it also added 5.24 GB of metadata. The estimated free space is now 16.55 GB, while before the deduplication it was higher: 17.17 GB. This is when running duperemove git with noblock, but almost nothing changes if I omit it (it defaults to block). Why did my metadata increase by a 4x factor? 99% of my data already had shared extents because of snapshots, so why such a huge increase?

Deduplication didn't finish up to 100%, because duperemove got killed by the OOM killer at 99%: https://paste.pound-python.org/show/yUcIOSzXcrfNPkF9rV2L/

As you can see from dmesg (https://paste.pound-python.org/show/eZIkpxUU6QR9ij6Rn1Oq/) there is no process stealing that much memory (my system has 8GB): the biggest one takes as much as 700MB of vm.

Another strange thing you can see from the previous log is that it tries to deduplicate /home/niko/nosnap/rootfs/@images/fedora25.qcow2, which is a UNIQUE file. That image is stored in a separate subvolume because I don't want it to be snapshotted, so I'm pretty sure there are no other copies of it, but it still tries to deduplicate it.

Niccolò Belli
Re: degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0
On Thursday 17 November 2016 21:20:56 CET, Austin S. Hemmelgarn wrote:
> On 2016-11-17 15:05, Chris Murphy wrote:
>>> I think the wiki should be updated to reflect that raid1 and raid10 are mostly OK.
>> I think it's grossly misleading to consider either as green/OK when a single degraded read-write mount creates single chunks that will then prevent a subsequent degraded read-write mount. And also the lack of various notifications of device faultiness I think makes it less than OK. It's not in the "do not use" category, but it should be in the middle-ground status so users can make informed decisions.
>
> It's worth pointing out also regarding this:
> * This is handled sanely in recent kernels (the check got changed from per-fs to per-chunk, so you still have a usable FS if all the single chunks are only on devices you still have).
> * This is only an issue with filesystems with exactly two disks. If a 3+ disk raid1 FS goes degraded, you still generate raid1 chunks.
> * There are a couple of other cases where raid1 mode falls flat on its face (lots of I/O errors in a short span of time with compression enabled can cause a kernel panic, for example).
> * raid10 has some other issues of its own (you lose two devices, your filesystem is dead, which shouldn't be the case 100% of the time (if you lose different parts of each mirror, BTRFS _should_ be able to recover, it just doesn't do so right now)).
> As far as the failed device handling issues, those are a problem with BTRFS in general, not just raid1 and raid10, so I wouldn't count those against raid1 and raid10.

Everything you mentioned should be in the wiki IMHO. Knowledge is power.
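On the single-chunks-after-a-degraded-mount point: once the missing device is back or replaced, the stray single chunks can be converted back with a filtered balance, where the `soft` filter skips chunks already in the target profile. A sketch with a hypothetical mountpoint; run() only prints the command when DRYRUN is set:

```shell
#!/bin/bash
# Convert any 'single' chunks (left over from a degraded rw mount on a
# two-device raid1) back to raid1, leaving already-raid1 chunks alone.
run() { ${DRYRUN:+echo} "$@"; }

reconvert_to_raid1() {
    local mnt=$1
    run btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft "$mnt"
}
```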
Re: Announcing btrfs-dedupe
[dmesg OOM report, truncated]
[...] 1575 15714810329 220 4 2473 0 konsole
[ 6342.147739] [ 1579]  1000  1579    4021     56  13  3   164  0  bash
[ 6342.147741] [ 1582]     0  1582   17563     23  38  3   248  0  sudo
[ 6342.147742] [ 1583]     0  1583    3425     53  11  3   118  0  duperemove.sh
[ 6342.147744] [ 4060]     0  4060  168501  92579 203  3    24  0  duperemove
[ 6342.147746] Out of memory: Kill process 4060 (duperemove) score 21 or sacrifice child
[ 6342.147754] Killed process 4060 (duperemove) total-vm:674004kB, anon-rss:367672kB, file-rss:2644kB, shmem-rss:0kB

Any idea? The process with the highest total_vm is plasmashell, but it has only 900MB of vm.

Niccolò Belli
Re: Announcing btrfs-dedupe
On Tuesday 15 November 2016 18:52:01 CET, Zygo Blaxell wrote:
> Like I said, millions of extents per week... 64K is an enormous dedup block size, especially if it comes with a 64K alignment constraint as well. These are the top ten duplicate block sizes from a sample of 95251 dedup ops on a medium-sized production server with 4TB of filesystem (about one machine-day of data):

Which software do you use to dedupe your data? I tried duperemove but it gets killed by the OOM killer because it triggers some kind of memory leak: https://github.com/markfasheh/duperemove/issues/163

Niccolò Belli
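Until such a leak is fixed, one stopgap (assuming a systemd host; the `MemoryMax=` property needs a reasonably recent systemd, older versions spell it `MemoryLimit=`) is to confine duperemove in a transient scope, so the kernel kills or caps it early instead of taking the whole session down. A sketch, with the target path and cap hypothetical; run() only prints when DRYRUN is set:

```shell
#!/bin/bash
# Run duperemove under a cgroup memory cap via a transient systemd scope.
run() { ${DRYRUN:+echo} "$@"; }

dedupe_capped() {
    local target=$1
    run systemd-run --scope -p MemoryMax=2G \
        duperemove -drh --hashfile=/var/tmp/dedupe.hash "$target"
}
```

Using `--hashfile` (as the thread already does) also keeps the hash database on disk rather than in RAM, which helps independently of the cap.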
Re: Announcing btrfs-dedupe
Hi,
What do you think about jdupes? I'm looking for an alternative to duperemove, and rmlint doesn't seem to support btrfs deduplication, so I would like to try jdupes. My main problem with duperemove is a memory leak; it also seems to lead to greater disk usage: https://github.com/markfasheh/duperemove/issues/163

Niccolò Belli

On Tuesday 8 November 2016 23:36:25 CET, Saint Germain wrote:
> Please be aware of these other similar programs:
> - jdupes: https://github.com/jbruchon/jdupes
> - rmlint: https://github.com/sahib/rmlint
> And of course fdupes.
>
> Some interesting points I have seen in them:
> - use xxhash to identify potential duplicates (huge speedup)
> - ability to deduplicate read-only snapshots
> - identify potential reflinked files (see also my email here: https://www.spinics.net/lists/linux-btrfs/msg60081.html)
> - ability to filter out hardlinks
> - triangle problem: see the jdupes readme
> - jdupes has started the process to be included in Debian
>
> I hope that will help and that you can share some code with them!
Re: Announcing btrfs-dedupe
On Tuesday 8 November 2016 17:58:52 CET, James Pharaoh wrote:
> Yes, everything you have described here is something I intend to create, and might as well include in the tool itself. I'll add it to the roadmap ;-)

Sounds good, but I have yet another feature request, which is even more interesting in my opinion. If you have ever used snapper, you have probably already found yourself in the position where you want to free some space and you actually can't, because the files you want to delete are already present in countless snapshots. You then have to delete the unwanted files from every snapshot, which is a tedious task, even more difficult if you have moved/renamed those files. What I actually do is exploit duperemove's hashfile to grep for the checksum and obtain all the paths. Then I have to switch the snapshots to rw, manually delete each file and finally switch them back to ro. A tool which automates these tasks would be awesome.

Niccolò
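The manual procedure described above is easy to script in the meantime. A sketch assuming snapper's default /.snapshots/<number>/snapshot layout, with the snapshot root and the file's relative path (e.g. something grepped out of the hashfile) as parameters; run() only prints the commands when DRYRUN is set:

```shell
#!/bin/bash
# Flip each snapshot rw, remove the unwanted file, flip it back ro.
# Note: toggling a received snapshot's ro flag can break incremental
# send/receive for it, as discussed elsewhere in the thread.
run() { ${DRYRUN:+echo} "$@"; }

purge_from_snapshots() {
    local snaproot=$1 relpath=$2 snap
    for snap in "$snaproot"/*/snapshot; do
        run btrfs property set -ts "$snap" ro false
        run rm -f "$snap/$relpath"
        run btrfs property set -ts "$snap" ro true
    done
}
```

Usage would be something like `DRYRUN=1 purge_from_snapshots /.snapshots home/niko/bigfile` to preview, then the same call as root without DRYRUN.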
Re: Announcing btrfs-dedupe
On Tuesday 8 November 2016 12:38:48 CET, James Pharaoh wrote: You can't deduplicate a read-only snapshot, but you can create read-write snapshots from them, deduplicate those, and then recreate the read-only ones. This is what I've done.

Since snapper creates hundreds of snapshots, isn't this something that the deduplication software could do for me if I explicitly tell it to do so? I mean momentarily switching the snapshot to rw in order to deduplicate it, then switching it back to ro.

In theory, once this has been done once, it shouldn't have to be done again, at least for those snapshots, unless you want to modify the deduplication. It's probably a good idea to defragment files and directories first, as well.

I can't defragment anything, because it would take too much free space to do so with so many snapshots. Instead, the deduplication software could defragment each file before calling the extent-same ioctl; that would be feasible. That way you would not need huge amounts of free space to defragment the fs.

It should be possible to deduplicate a read-only file to a read-write one, but that's probably not worth the effort in many real-world use cases.

This is exactly what I would expect a deduplication tool to do when it encounters a ro snapshot, except when I explicitly tell it to momentarily switch the snapshot to rw in order to deduplicate it.

Niccolo' Belli
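The rw/ro round-trip being asked for here could be wrapped around a dedupe run along these lines. The snapshot layout, `SNAP_ROOT`, the hashfile path and the `DRY_RUN` guard are all hypothetical illustration, not something btrfs-dedupe or duperemove provided at the time:

```shell
#!/bin/sh
# Sketch only: temporarily flip read-only snapshots to rw, deduplicate
# them, then restore the ro flag. DRY_RUN=1 (default) only prints commands.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi; }

dedupe_ro_snapshots() {
    root="${SNAP_ROOT:-/.snapshots}"
    for snap in "$root"/*/snapshot; do
        run btrfs property set -ts "$snap" ro false
    done
    run duperemove -dr --hashfile=/tmp/snapshots.hash "$root"
    for snap in "$root"/*/snapshot; do
        run btrfs property set -ts "$snap" ro true
    done
}
```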
Re: Announcing btrfs-dedupe
Nice, you should probably update the btrfs wiki as well, because there is no mention of btrfs-dedupe.

First question: why this name? Don't you plan to support XFS as well?

Second question: I'm trying deduplication tools for the very first time and I still have to figure out how to handle snapper snapshots, which are read-only. So far I have tried duperemove 0.11 git and I get tons of "Error 30: Read-only file system while opening "/.../@snapshots/4385/...". How am I supposed to handle snapper snapshots? I do not run duperemove from a live distro; instead I run it directly on the system I want to deduplicate:

sudo mount -o noatime,compress=lzo,autodefrag /dev/mapper/cryptroot /home/niko/nosnap/rootfs/
sudo duperemove -drh --dedupe-options=nofiemap --hashfile=/home/niko/nosnap/rootfs.hash /home/niko/nosnap/rootfs/

Is btrfs-dedupe able to handle snapper snapshots?

Thanks,
Niccolo' Belli
Re: Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo
On Friday 13 May 2016 08:11:27 CEST, Duncan wrote: In theory the various btrfs dedup solutions out there should work as well, while letting you keep the snapshots (at least to the extent they're either writable snapshots so can be reflink modified

Unfortunately, as you said, dedup doesn't work with read-only snapshots (I only use read-only snapshots with snapper) :(

Does bedup's dedup-syscall branch (https://github.com/g2p/bedup/tree/wip/dedup-syscall), which uses the new batch deduplication ioctl merged in Linux 3.12, fix this? Unfortunately the latest commit is from September :(
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On Friday 13 May 2016 13:35:01 CEST, Austin S. Hemmelgarn wrote: The fact that you're getting an OOPS involving core kernel threads (kswapd) is a pretty good indication that either there's a bug elsewhere in the kernel, or that something is wrong with your hardware. It's really difficult to be certain if you don't have a reliable test case though.

Talking about reliable test cases, I forgot to say that I definitely found an interesting one. It doesn't lead to an OOPS, but perhaps to something even more interesting. While running countless stress tests I tried running some games to stress the system in different ways. I chose openmw (an open source engine for Morrowind) and played it for a while on my second external monitor (while I watched some monitoring tools on my first monitor). I noticed that after playing for a while I *always* lose the internet connection (I use a USB3 Gigabit Ethernet adapter). That isn't the only thing which happens: even though the game keeps running flawlessly and the system *seems* to work fine (I can drag windows, open the terminal...), lots of commands simply stall (for example mounting a partition, unmounting it, rebooting...). I can reliably reproduce it; it ALWAYS happens.

Niccolò
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On Thursday 12 May 2016 17:43:38 CEST, Austin S. Hemmelgarn wrote: That's probably a good indication of the CPU and the MB being OK, but not necessarily the RAM. There's two other possible options for testing the RAM that haven't been mentioned yet though (which I hadn't thought of myself until now): 1. If you have access to Windows, try the Windows Memory Diagnostic. This runs yet another slightly different set of tests from memtest86 and memtest86+, so it may catch issues they don't. You can start this directly on an EFI system by loading /EFI/Microsoft/Boot/MEMTEST.EFI from the EFI system partition. 2. This is a Dell system. If you still have the utility partition which Dell ships all their pre-provisioned systems with, that should have a hardware diagnostics tool. I doubt that this will find anything (it's part of their QA procedure AFAICT), but it's probably worth trying, as the memory testing in that uses yet another slightly different implementation of the typical tests. You can usually find this in the boot interrupt menu accessed by hitting F12 before the boot-loader loads.

I tried the Dell System Test, including the optional enhanced RAM tests, and it was fine. I also tried the Microsoft one, which passed. BUT if I select the advanced test in the Microsoft one, it always stops at 21% of the first test. The test menus still work, but the fans get quiet and it keeps writing "test running... 21%" forever. I tried it many times and it always got stuck at 21%, so I suspect a test suite bug rather than a RAM failure.

I also noticed some other interesting behaviours: while I was running the usual scrub+check (both were fine) from the livecd I noticed this in dmesg:

[ 261.301159] BTRFS info (device dm-0): bdev /dev/mapper/cryptroot errs: wr 0, rd 0, flush 0, corrupt 4, gen 0

Corrupt? But both scrub and check were fine... I double checked scrub and check and they were still fine.
This is what happened another time: https://drive.google.com/open?id=0Bwe9Wtc-5xF1dGtPaWhTZ0w5aUU

I was making a backup of my partition USING DD from the livecd. It wasn't even mounted, if I recall correctly!

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote: That's what a RAM corruption problem looks like when you run btrfs scrub. Maybe the RAM itself is OK, but *something* is scribbling on it. Does the Arch live usb use the same kernel as your normal system?

Yes, except for the point release (the system is slightly ahead of the liveusb).

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote: Did you try an older (or newer) kernel? I've been running 4.5.x on a few canary systems, but so far none of them have survived more than a day.

No (except for point releases from 4.5.0 to 4.5.4), but I will try 4.4.

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote: It's possible there's a problem that affects only very specific chipsets You seem to have eliminated RAM in isolation, but there could be a problem in the kernel that affects only your chipset.

Funny, considering it is sold as a Linux laptop. Unfortunately they only tested it with the ancient Ubuntu 14.04.

Niccolò
Re: Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo
Thanks for the detailed explanation; hopefully in the future someone will be able to make defrag snapshot/reflink aware in a scalable manner. I will not use defrag anymore, but what do you suggest I do to reclaim the lost space? Get rid of my current snapshots, or maybe simply run bedup?

Niccolò
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On Monday 9 May 2016 18:29:41 CEST, Zygo Blaxell wrote: Did you also check the data matches the backup? btrfs check will only look at the metadata, which is 0.1% of what you've copied. From what you've written, there should be a lot of errors in the data too. If you have incorrect data but btrfs scrub finds no incorrect checksums, then your storage layer is probably fine and we have to look at CPU, host RAM, and software as possible culprits. The logs you've posted so far indicate that bad metadata (e.g. negative item lengths, nonsense transids in metadata references but sane transids in the referred pages) is getting into otherwise valid and well-formed btrfs metadata pages. Since these pages are protected by checksums, the corruption can't be originating in the storage layer--if it was, the pages should be rejected as they are read from disk, before btrfs even looks at them, and the insane transid should be the "found" one not the "expected" one. That suggests there is either RAM corruption happening _after_ the data is read from disk (i.e. while the pages are cached in RAM), or a severe software bug in the kernel you're running.

When doing the btrfs check I also always do a btrfs scrub, and it never found any error. Once it didn't manage to finish the scrub because of:

BTRFS critical (device dm-0): corrupt leaf, slot offset bad: block=670597120,root=1, slot=6

and btrfs scrub status reported "was aborted after 00:00:10".

Talking about scrub, I created a systemd timer to run scrub hourly and I noticed 2 *uncorrectable* errors suddenly appeared on my system. So I immediately re-ran the scrub just to confirm it, then rebooted into the Arch live usb and ran btrfs check: the metadata were perfect. So I ran btrfs scrub from the live usb and there were no errors at all! I rebooted into my system and ran scrub once again and the uncorrectable errors were really gone! It happened two times in the past few days.

Try different kernel versions (e.g.
4.4.9 or 4.1.23) in case whoever maintains your kernel had a bad day and merged a patch they should not have.

Almost no patches get applied by the Arch kernel team: https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/linux At the moment the only one is a harmless "change-default-console-loglevel.patch".

Try a minimal configuration with as few drivers as possible loaded, especially GPU drivers and anything from the staging subdirectory--when these drivers have bugs, they ruin everything.

The Arch kernel team is quite conservative regarding staging/experimental features; I remember they rejected some config patches I submitted because of this. Anyway, I will try to blacklist as many kernel modules as I can. Maybe blacklisting the GPU is too much, because if I can't actually use my laptop it will be much more difficult to reproduce the issue.

Try memtest86+ which has a few more/different tests than memtest86. I have encountered RAM modules that pass memtest86 but fail memtest86+ and vice versa. Try memtester, a memory tester that runs as a Linux process, so it can detect corruption caused when device drivers spray data randomly into RAM, or when the CPU thermal controls are influenced by Linux (an overheating CPU-to-RAM bridge can really ruin your day, and some of the dumber laptop designs rely on the OS for thermal management). Try running more than one memory testing process, in case there is a bug in your hardware that affects interactions between multiple cores (memtest is single-threaded). You can run memtest86 inside a kvm (e.g. kvm -m 3072 -kernel /boot/memtest86.bin) to detect these kinds of issues. Kernel compiles are a bad way to test RAM. I've successfully built kernels on hosts with known RAM failures. The kernels don't always work properly, but it's quite rare to see a build fail outright.

I didn't use memtest86+ because of the lack of EFI support, but I just tried the shiny new memtest86 7.0 beta with improved tests for 12+ hours without issues.
Also I ran "memtester 4G" and "systester-cli -gausslg 64M -threads 4 -turns 10" together for 12 hours without any issue, so I think both my RAM and CPU are OK. I can think of only two possible culprits now (correct me if I'm wrong):

1) A btrfs bug
2) Another module screwing things up

I can do nothing about btrfs bugs, so I will try to hunt the second option. This is the list of modules I'm running:

lsmod | awk '$4 == ""' | awk '{print $1}' | sort

8250_dw ac acpi_als acpi_pad aesni_intel ahci algif_skcipher ansi_cprng arc4 atkbd battery bnep btrfs btusb cdc_ether cmac coretemp crc32c_intel crc32_pclmul crct10dif_pclmul dell_laptop dell_wmi dm_crypt drbg ecb elan_i2c evdev ext4 fan fjes ghash_clmulni_intel gpio_lynxpoint hid_generic hid_multitouch hmac i2c_designware_platform i2c_hid i2c_i801 i915 input_leds int3400_thermal int3402_thermal int3403_thermal intel_hid intel_pch_thermal intel_powerclamp intel_rapl ip_tables
Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo
Hi,

Before doing the daily backup I did a btrfs check and btrfs scrub as usual. This time I also decided to run btrfs filesystem defragment -r -v -clzo on all subvolumes (from a live distro) and, just to be sure, I ran check and scrub once again.

Before defragment: total bytes scrubbed: 15.90GiB with 0 errors
After defragment: total bytes scrubbed: 26.66GiB with 0 errors

What happened? This is something like a night and day difference: almost double the data! As stated in the subject, all the subvolumes have always been mounted with compress=lzo in /etc/fstab; even when I installed the distro a couple of days ago I manually mounted the subvolumes with -o compress=lzo. I never used autodefrag.

Niccolò
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On Sunday 8 May 2016 20:27:55 CEST, Patrik Lundquist wrote: Are you using any power management tweaks?

Yes, as stated in my very first post I use TLP with SATA_LINKPWR_ON_BAT=max_performance, but I managed to reproduce the bug even without TLP. Also, in the past week I've always been on AC.

On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote: Memtest doesn't replicate typical usage patterns very well. My usual testing for RAM involves not just memtest, but also booting into a LiveCD (usually SystemRescueCD), pulling down a copy of the kernel source, and then running as many concurrent kernel builds as cores, each with as many make jobs as cores (so if you've got a quad core CPU (or a dual core with hyperthreading), it would be running 4 builds with -j4 passed to make). GCC seems to have memory usage patterns that reliably trigger memory errors that aren't caught by memtest, so this generally gives good results.

Building the kernel with 4 concurrent threads is not an issue for my system; in fact I compile a lot and I never had any issue.

On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote: On a similar note, badblocks doesn't replicate filesystem like access patterns, it just runs sequentially through the entire disk. This isn't as likely to give bad results, but it's still important to know. In particular, try running it over a dmcrypt volume a couple of times (preferably with a different key each time, pulling keys from /dev/urandom works well for this), as that will result in writing different data. For what it's worth, when I'm doing initial testing of new disks, I always use ddrescue to copy /dev/zero over the whole disk, then do it twice through dmcrypt with different keys, copying from the disk to /dev/null after each pass. This gives random data on disk as a starting point (which is good if you're going to use dmcrypt), and usually triggers reallocation of any bad sectors as early as possible.
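Austin's burn-in procedure (one zero pass, then two passes through a plain dmcrypt mapping keyed from /dev/urandom) might look roughly like the sketch below. The device name is a placeholder and the `DRY_RUN` guard is added so the sketch prints rather than destroys; when actually run, this wipes the entire disk:

```shell
#!/bin/sh
# Sketch only: disk burn-in as described above. DESTROYS ALL DATA on the
# target when DRY_RUN=0; by default it just prints the commands.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi; }

burn_in() {
    disk="${1:-/dev/sdX}"   # placeholder device name
    run dd if=/dev/zero of="$disk" bs=1M          # pass 1: plain zeroes
    for pass in 1 2; do
        # plain dmcrypt mapping with a throwaway random key: writing zeroes
        # through it puts different random-looking data on the disk each time
        run cryptsetup open --type plain --key-file /dev/urandom "$disk" wipe$pass
        run dd if=/dev/zero of=/dev/mapper/wipe$pass bs=1M
        run dd if=/dev/mapper/wipe$pass of=/dev/null bs=1M   # read-back pass
        run cryptsetup close wipe$pass
    done
}
```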
While trying to find a common denominator for my issue I made lots of backups of /dev/mapper/cryptroot and restored them into /dev/mapper/cryptroot dozens of times (triggering a 150GB+ random data write every time), without any issue (after restoring the backup I always check the partition with btrfs check). So the disk doesn't seem to be the culprit.

On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote: 1. If you have an eSATA port, try plugging your hard disk in there and see if things work. If that works but having the hard drive plugged in internally doesn't, then the issue is probably either that specific SATA port (in which case your chip-set is bad and you should get a new system), or the SATA connector itself (or the wiring, but that's not as likely when it's traces on a PCB). Normally I'd suggest just swapping cables and SATA ports, but that's not really possible with a laptop. 2. If you have access to a reasonably large flash drive, or to a USB to SATA adapter, try that as well; if it works on that but not internally (or on an eSATA port), you've probably got a bad SATA controller, and should get a new system.

My laptop doesn't have an eSATA port and my only big enough external drive is currently used for daily backups, since I fear data loss.

On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote: 3. Try things without dmcrypt. Adding extra layers makes it harder to determine what is actually wrong. If it works without dmcrypt, try using different parameters for the encryption (different ciphers is what I would try first). If it works reliably without dmcrypt, then it's either a bug in dmcrypt (which I don't think is very likely), or it's bad interaction between dmcrypt and BTRFS. If it works with some encryption parameters but not others, then that will help narrow down where the issue is.
On Sunday 8 May 2016 01:35:16 CEST, Chris Murphy wrote: You're making the troubleshooting unnecessarily difficult by continuing to use non-default options. *shrug* Every single layer you add complicates the setup and troubleshooting. Of course all of it should work together, many people do. But you're the one having the problem so in order to demonstrate whether this is a software bug or hardware problem, you need to test it with the most basic setup possible --> btrfs on plain partitions and default mount options.

I will try to recap, because you obviously missed my previous e-mail: I managed to replicate the irrecoverable corruption bug even with default options and no dmcrypt at all. Somehow it was a bit more difficult to replicate with default options, so I started to play with different combinations to see whether there was something which increased the chances of getting corruption. I have the feeling that "autodefrag" increases the chances of corruption, but I'm not 100% sure about it. Anyway, triggering a whole packages reinstall with "pacaur -S $(pacman
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On 2016-05-07 17:58, Clemens Eisserer wrote: Hi Niccolo, btrfs + dmcrypt + compress=lzo + autodefrag = corruption at first boot Just to be curious - couldn't it be a hardware issue? I use almost the same setup (compress-force=lzo instead of compress=lzo) on my laptop for 2-3 years and haven't experienced any issues since ~kernel-3.14 or so. Br, Clemens Eisserer

Hi,

Which kind of hardware issue? I did a full memtest86 check, a full smartmontools extended check and even a badblocks -wsv. If this is really a hardware issue that we can identify, I would be more than happy, because Dell would replace my laptop and this nightmare would finally be over. I'm open to suggestions.

Niccolò
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
btrfs + dmcrypt + compress=lzo + autodefrag = corruption at first boot

So discard is not the culprit. Will try to remove compress=lzo and autodefrag and see if it still happens.

[ 748.224346] BTRFS error (device dm-0): memmove bogus src_offset 5431 move len 4294962894 len 16384
[ 748.226206] [ cut here ]
[ 748.227831] kernel BUG at fs/btrfs/extent_io.c:5723!
[ 748.229498] invalid opcode: [#1] PREEMPT SMP
[ 748.231161] Modules linked in: ext4 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat snd_hda_codec_hdmi dell_laptop dcdbas dell_wmi iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel arc4 kvm irqbypass psmouse serio_raw pcspkr elan_i2c snd_soc_ssm4567 snd_soc_rt286 snd_soc_rl6347a snd_soc_core i2c_hid iwlmvm snd_compress snd_pcm_dmaengine ac97_bus mac80211 uvcvideo videobuf2_vmalloc btusb videobuf2_memops cdc_ether btrtl usbnet iwlwifi btbcm videobuf2_v4l2 btintel intel_pch_thermal videobuf2_core i2c_i801 videodev r8152 rtsx_pci_ms cfg80211 bluetooth visor media mii memstick joydev evdev mousedev input_leds rfkill mac_hid crc16 i915 fan thermal wmi dw_dmac int3403_thermal video dw_dmac_core drm_kms_helper snd_soc_sst_acpi i2c_designware_platform snd_soc_sst_match
[ 748.237203] snd_hda_intel 8250_dw i2c_designware_core gpio_lynxpoint spi_pxa2xx_platform drm int3402_thermal snd_hda_codec battery tpm_crb intel_hid snd_hda_core sparse_keymap fjes snd_hwdep int3400_thermal acpi_thermal_rel tpm_tis snd_pcm intel_gtt tpm acpi_als syscopyarea sysfillrect snd_timer sysimgblt fb_sys_fops mei_me i2c_algo_bit processor_thermal_device kfifo_buf processor snd industrialio acpi_pad ac int340x_thermal_zone mei intel_soc_dts_iosf button lpc_ich soundcore shpchp sch_fq_codel ip_tables x_tables btrfs xor raid6_pq jitterentropy_rng sha256_ssse3 sha256_generic hmac drbg ansi_cprng algif_skcipher af_alg uas usb_storage dm_crypt dm_mod sd_mod rtsx_pci_sdmmc atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
aes_x86_64 lrw gf128mul glue_helper
[ 748.244176] ablk_helper cryptd ahci libahci libata scsi_mod xhci_pci rtsx_pci xhci_hcd i8042 serio sdhci_acpi sdhci led_class mmc_core pl2303 mos7720 usbserial parport hid_generic usbhid hid usbcore usb_common
[ 748.246662] CPU: 0 PID: 2316 Comm: pacman Not tainted 4.5.1-1-ARCH #1
[ 748.249123] Hardware name: Dell Inc. XPS 13 9343/0F5KF3, BIOS A07 11/11/2015
[ 748.251576] task: 8800d9d98e40 ti: 8800cec1 task.ti: 8800cec1
[ 748.254064] RIP: 0010:[] [] memmove_extent_buffer+0x10c/0x110 [btrfs]
[ 748.256600] RSP: 0018:8800cec13c18 EFLAGS: 00010246
[ 748.259120] RAX: RBX: 88020c01ba40 RCX: 0056
[ 748.261631] RDX: RSI: 88021e40db38 RDI: 88021e40db38
[ 748.264166] RBP: 8800cec13c48 R08: R09: 033b
[ 748.266716] R10: R11: 033b R12: eece
[ 748.269267] R13: 00010405 R14: 000104c9 R15: 88020c01ba40
[ 748.271818] FS: 7f14d4271740() GS:88021e40() knlGS:
[ 748.274392] CS: 0010 DS: ES: CR0: 80050033
[ 748.276987] CR2: 01630008 CR3: cffc8000 CR4: 003406f0
[ 748.279603] DR0: DR1: DR2:
[ 748.282220] DR3: DR6: fffe0ff0 DR7: 0400
[ 748.284815] Stack:
[ 748.287422] e3438cd2 88020c01ba40 00c4 002a
[ 748.290082] 006b 03a0 8800cec13ce8 a02b612c
[ 748.292754] a02b433d 8800da9ca820 0028 8800daa78bd0
[ 748.295441] Call Trace:
[ 748.298104] [] btrfs_del_items+0x33c/0x4a0 [btrfs]
[ 748.300827] [] ? btrfs_search_slot+0x90d/0x990 [btrfs]
[ 748.303564] [] ? btrfs_get_token_8+0x6c/0x130 [btrfs]
[ 748.306311] [] btrfs_truncate_inode_items+0x649/0xd20 [btrfs]
[ 748.309071] [] ?
btrfs_delayed_inode_release_metadata.isra.1+0x4e/0xf0 [btrfs]
[ 748.311860] [] btrfs_evict_inode+0x485/0x5d0 [btrfs]
[ 748.314627] [] evict+0xc5/0x190
[ 748.317412] [] iput+0x1d9/0x260
[ 748.320199] [] do_unlinkat+0x199/0x2d0
[ 748.322988] [] SyS_unlink+0x16/0x20
[ 748.325781] [] entry_SYSCALL_64_fastpath+0x12/0x6d
[ 748.328584] Code: 41 5e 41 5f 5d c3 48 8b 7f 18 48 89 f2 48 c7 c6 40 44 36 a0 e8 06 90 fa ff 0f 0b 48 8b 7f 18 48 c7 c6 08 44 36 a0 e8 f4 8f fa ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb
[ 748.331558] RIP [] memmove_extent_buffer+0x10c/0x110 [btrfs]
[ 748.334473] RSP
[ 748.356077] ---[ end trace 9bfb28800ab52273 ]---
[ 748.359042] note: pacman[2316] exited with preempt_count 2
Re: How to rollback a snapshot of a subvolume with nested subvolumes?
Thanks, I hoped there was something like a hidden recursive flag to avoid the tedious task of snapshotting all the nested subvolumes or deleting the nested ones, but it seems there isn't. I usually don't want to keep things like /var/cache/pacman/pkg, but since I'm just doing some tests I didn't want to lose my package cache. Regarding @/.snapshots, it was an unfortunate choice made by snapper and I will definitely create it in the top level, like /srv, which shouldn't belong to @. By the way, snapper rollbacks are yet another reason not to keep subvolid along with subvol=@ in fstab, like in the one automatically generated by genfstab.

Niccolò
How to rollback a snapshot of a subvolume with nested subvolumes?
The following are my subvolumes:

$ sudo btrfs subvol list /
ID 257 gen 1040 top level 5 path @
ID 258 gen 1040 top level 5 path @home
ID 270 gen 889 top level 257 path var/cache/pacman/pkg
ID 271 gen 15 top level 257 path var/abs
ID 272 gen 972 top level 257 path var/tmp
ID 273 gen 37 top level 257 path tmp
ID 274 gen 20 top level 257 path srv
ID 276 gen 25 top level 258 path @home/niko/.cache/pacaur
ID 280 gen 993 top level 257 path .snapshots
ID 281 gen 993 top level 258 path @home/.snapshots
ID 282 gen 169 top level 280 path .snapshots/1/snapshot
ID 283 gen 171 top level 280 path .snapshots/2/snapshot
ID 284 gen 173 top level 280 path .snapshots/3/snapshot
ID 285 gen 124 top level 281 path @home/.snapshots/1/snapshot
ID 286 gen 175 top level 280 path .snapshots/4/snapshot
ID 288 gen 177 top level 280 path .snapshots/5/snapshot
ID 290 gen 237 top level 280 path .snapshots/6/snapshot
ID 291 gen 238 top level 281 path @home/.snapshots/2/snapshot
ID 292 gen 308 top level 280 path .snapshots/7/snapshot
ID 293 gen 309 top level 281 path @home/.snapshots/3/snapshot
ID 294 gen 376 top level 280 path .snapshots/8/snapshot
ID 295 gen 377 top level 281 path @home/.snapshots/4/snapshot
ID 296 gen 442 top level 280 path .snapshots/9/snapshot
ID 297 gen 443 top level 281 path @home/.snapshots/5/snapshot
ID 298 gen 511 top level 280 path .snapshots/10/snapshot
ID 299 gen 512 top level 281 path @home/.snapshots/6/snapshot
ID 300 gen 578 top level 280 path .snapshots/11/snapshot
ID 301 gen 579 top level 281 path @home/.snapshots/7/snapshot
ID 302 gen 648 top level 280 path .snapshots/12/snapshot
ID 303 gen 649 top level 281 path @home/.snapshots/8/snapshot
ID 304 gen 716 top level 280 path .snapshots/13/snapshot
ID 305 gen 717 top level 281 path @home/.snapshots/9/snapshot
ID 306 gen 967 top level 280 path .snapshots/14/snapshot
ID 307 gen 789 top level 281 path @home/.snapshots/10/snapshot
ID 309 gen 834 top level 280 path .snapshots/15/snapshot
ID 310 gen 874 top level 280 path .snapshots/16/snapshot
ID 311 gen 875 top level 281 path @home/.snapshots/11/snapshot
ID 312 gen 887 top level 280 path .snapshots/17/snapshot
ID 313 gen 888 top level 280 path .snapshots/18/snapshot
ID 314 gen 904 top level 280 path .snapshots/19/snapshot
ID 316 gen 938 top level 280 path .snapshots/20/snapshot
ID 317 gen 939 top level 281 path @home/.snapshots/12/snapshot
ID 318 gen 991 top level 280 path .snapshots/21/snapshot
ID 319 gen 992 top level 281 path @home/.snapshots/13/snapshot

I would like to roll back to .snapshots/14/snapshot, restoring my @ subvolume to that previous state. So I booted into a livecd, mounted my disk into /mnt and typed:

btrfs subvol snapshot /mnt/@/.snapshots/14/snapshot /mnt/@
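Since there is no recursive snapshot, a manual rollback along those lines has to re-create each nested subvolume by hand (nested subvolumes show up as empty directories inside a snapshot). A rough sketch, assuming the btrfs top level is mounted at /mnt and using the subvolume layout listed above; the `@broken` name and the `DRY_RUN` guard are illustration-only:

```shell
#!/bin/sh
# Sketch only: roll @ back to .snapshots/14/snapshot from a live CD with
# the btrfs top level mounted at /mnt. DRY_RUN=1 (default) prints commands.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi; }

rollback() {
    top="${1:-/mnt}"
    run mv "$top/@" "$top/@broken"
    run btrfs subvolume snapshot "$top/@broken/.snapshots/14/snapshot" "$top/@"
    # nested subvolumes became empty directories inside the new snapshot:
    # drop each placeholder and re-snapshot the live subvolume in its place
    # (snapper's .snapshots subvolume would need similar treatment)
    for sub in var/cache/pacman/pkg var/abs var/tmp tmp srv; do
        run rmdir "$top/@/$sub"
        run btrfs subvolume snapshot "$top/@broken/$sub" "$top/@/$sub"
    done
}
```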
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
I formatted the partition and copied the content of my previous rootfs to it. There is no dmcrypt now and the mount options are the defaults, except for noatime. After a single boot I got the very same problem as before (fs corrupted and an infinite loop when doing btrfs check --repair). I wanted to replicate the results and so I tried once again; since then I have only experienced minor corruption, correctly resolved by repair. But during a pacman upgrade, which triggered snapper pre-post snapshots, the system hung and I found this in the logs:

mag 06 10:31:15 arch-laptop plasmashell[873]: requesting unexisting screen 2
mag 06 10:31:18 arch-laptop dbus[418]: [system] Activating service name='org.opensuse.Snapper' (using servicehelper)
mag 06 10:31:18 arch-laptop dbus[418]: [system] Successfully activated service 'org.opensuse.Snapper'
mag 06 10:31:20 arch-laptop kernel: [ cut here ]
mag 06 10:31:20 arch-laptop kernel: kernel BUG at fs/btrfs/ctree.h:2693!

Still no major corruption found since my second attempt.

Niccolò
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On Thursday 5 May 2016 03:07:37 CEST, Chris Murphy wrote: I suggest using defaults for starters. The only thing in that list that needs to be there is either subvolid or subvol, not both. Add in the non-default options once you've proven the defaults are working, and add them one at a time.

Yes, I read your previous suggestion and I already dropped subvolid, but since the problem had already happened I left it in the mail for completeness. Anyway the culprit here is genfstab, and that's probably what a beginner is going to use when installing the distro: https://wiki.archlinux.org/index.php/beginners'_guide#fstab

The disk is a SAMSUNG SSD PM851 M.2 2280 256GB (Firmware Version: EXT25D0Q).

The firmware is old if I understand the naming scheme used by Dell. It says EXT49D0Q is current. http://www.dell.com/support/home/al/en/aldhs1/Drivers/DriversDetails?driverId=0NXHH

According to this (http://forum.notebookreview.com/threads/2015-xps-13-ssd-fw-problem-with-m-2-samsung-pm851.770501/) the firmware you linked is for the mSATA version of the drive, not the M.2 one. EXT25D0Q seems to be the very latest one for my drive.

I advise using all defaults for everything for now, otherwise it's anyone's guess what you're running into.

On Thursday 5 May 2016 06:12:28 CEST, Qu Wenruo wrote: Would it be OK for you to test your btrfs on a plain ssd, without encryption? And just as Chris Murphy said, reducing mount options is also a pretty good debugging start point.

Ok, I will remove dmcrypt, discard, compress=lzo, nodefrag and see what happens.

I made a copy of /dev/mapper/cryptroot with dd on an external drive and ran btrfs check on it (btrfs-progs 4.5.2): https://drive.google.com/open?id=0Bwe9Wtc-5xF1SjJacXpMMU5mems (37MB)

Checked, but seems the output is truncated?

No, I didn't truncate the btrfs check output because it wasn't endless. I just truncated the repair output. I also have something new to report.
Do you remember when I said that my screen was black and I had to forcibly power off the system? Something similar happened today, and since in the meantime I had enabled the magic SysRq keys, I have been able to recover this from the logs:

mag 05 11:55:51 arch-laptop kdeinit5[960]: Registering "org.kde.StatusNotifierItem-1060-1/StatusNotifierItem" to system tray
mag 05 11:55:51 arch-laptop obexd[1098]: OBEX daemon 5.39
mag 05 11:55:51 arch-laptop dbus-daemon[920]: Successfully activated service 'org.bluez.obex'
mag 05 11:55:51 arch-laptop systemd[898]: Started Bluetooth OBEX service.
mag 05 11:55:51 arch-laptop korgac[1044]: log_kidentitymanagement: IdentityManager: There was no default identity. Marking first one as default.
mag 05 11:55:51 arch-laptop kernel: BUG: unable to handle kernel paging request at 00017d11
mag 05 11:55:51 arch-laptop kernel: IP: [] anon_vma_interval_tree_insert+0x3f/0x90
mag 05 11:55:51 arch-laptop kernel: PGD 0
mag 05 11:55:51 arch-laptop kernel: Oops: [#1] PREEMPT SMP
mag 05 11:55:51 arch-laptop kernel: Modules linked in: rfcomm(+) visor bnep uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media btusb btrtl btbcm btintel cdc_ether bluetooth usbnet r8152 crc16 mii joydev mousedev nvr
mag 05 11:55:51 arch-laptop kernel: mei_me syscopyarea sysfillrect snd sysimgblt fb_sys_fops i2c_algo_bit shpchp soundcore mei wmi thermal fan intel_hid sparse_keymap int3403_thermal video processor_thermal_device dw_dmac snd_soc_sst_acpi snd_soc_sst_m
mag 05 11:55:51 arch-laptop kernel: lrw gf128mul glue_helper ablk_helper cryptd ahci libahci libata scsi_mod xhci_pci rtsx_pci
mag 05 11:55:51 arch-laptop kernel: Bluetooth: RFCOMM TTY layer initialized
mag 05 11:55:51 arch-laptop kernel: Bluetooth: RFCOMM socket layer initialized
mag 05 11:55:51 arch-laptop kernel: Bluetooth: RFCOMM ver 1.11
mag 05 11:55:51 arch-laptop kernel: xhci_hcd
mag 05 11:55:51 arch-laptop kernel: i8042 serio sdhci_acpi sdhci led_class mmc_core pl2303 mos7720 usbserial parport hid_generic usbhid hid usbcore usb_common
mag 05 11:55:51 arch-laptop kernel: CPU: 0 PID: 351 Comm: systemd-udevd Not tainted 4.5.1-1-ARCH #1
mag 05 11:55:51 arch-laptop kernel: Hardware name: Dell Inc. XPS 13 9343/0F5KF3, BIOS A07 11/11/2015
mag 05 11:55:51 arch-laptop kernel: task: 88021347d580 ti: 880211f8c000 task.ti: 880211f8c000
mag 05 11:55:51 arch-laptop kernel: RIP: 0010:[] [] anon_vma_interval_tree_insert+0x3f/0x90
mag 05 11:55:51 arch-laptop kernel: RSP: 0018:880211f8fd68 EFLAGS: 00010206
mag 05 11:55:51 arch-laptop kernel: RAX: 8800da2f4820 RBX: 8800bb59ce40 RCX: 8800da2f4830
mag 05 11:55:51 arch-laptop kernel: RDX: 8800da2f4828 RSI: 8800374404a0 RDI: 8800c58dfa40
mag 05 11:55:51 arch-laptop kernel: RBP: 880211f8fdb8 R08: 00017c79 R09: 0007f55e2059
mag 05 11:55:51 arch-laptop kernel: R10: 0007f55e2053 R11: 8800c58dfa40 R12: 880037440460
mag 05 11:55:51 arch-laptop kernel:
btrfs ate my data in just two days, after a fresh install. RAM and disk are OK. It still mounts, but I cannot repair
I really need your help, because it's the second time btrfs has eaten my data in a couple of days, and I can't use my laptop if I don't find the culprit. This was the mail I sent a couple of days ago: https://www.spinics.net/lists/linux-btrfs/msg54754.html

I previously thought the culprit was a bug in kernel 4.6-rc, but I was wrong. Then I reinstalled the whole system (Arch Linux) from scratch, and after just two days I lost some of my data again. Once again btrfs check --repair got stuck in an infinite loop, and I can't repair my fs. The system has always been shut down properly, except for a single time when I had to forcibly power it off just after boot because I didn't see any signal on the screen.

First the obvious things:
- memory is OK (https://drive.google.com/open?id=0Bwe9Wtc-5xF1VnJ0SE9fT1FZMTg)
- disk is OK (https://drive.google.com/open?id=0Bwe9Wtc-5xF1NGRhd2daVDRJVGc)
- tlp has SATA_LINKPWR_ON_BAT=max_performance (https://drive.google.com/open?id=0Bwe9Wtc-5xF1dFAwUE5ETVpNWGM)
- rootfs mount options: rw,noatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@
- command line: BOOT_IMAGE=/@/boot/vmlinuz-linux root=UUID=4fc2278e-f6e8-4a21-8876-cabbf885bb2e rw rootflags=subvol=@ cryptdevice=/dev/disk/by-uuid/c7c8f501-507c-4bd2-a80a-8c7360651f02:cryptroot:allow-discards quiet
- scrub didn't find any error:

$ sudo btrfs scrub status /
scrub status for 4fc2278e-f6e8-4a21-8876-cabbf885bb2e
        scrub started at Thu May 5 00:57:30 2016 and finished after 00:00:45
        total bytes scrubbed: 22.26GiB with 0 errors

I have the whole rootfs encrypted, including boot. I followed these steps: https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Btrfs_subvolumes_with_swap

Disk is a SAMSUNG SSD PM851 M.2 2280 256GB (Firmware Version: EXT25D0Q). Laptop is a Dell XPS 13 9343 QHD+. Distro is Arch Linux, kernel version is 4.5.1, btrfs-progs is 4.5.2.
Two days after the previous data loss I finished reinstalling my distro from scratch, then decided to make a full backup from a snapshot using tar. This is what I got while trying to back up my data:

tar: usr/share/kig/icons/hicolor/32x32/actions/test.png: Read error at byte 0, while reading 810 bytes: Input/output error
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebpd.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/pointOnLine.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/bezierN.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/convexhull.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/centerofcurvature.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/en.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebps.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/directrix.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/beziercurves.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/segment_midpoint.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/distance.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebcl.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/conicb5p.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/kig_polygon.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/conicasymptotes.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/pointxy.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/attacher.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/coniclineintersection.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/vectorsum.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/rbezier4.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/ellipsebffp.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/angle.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/kig_text.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/vectordifference.png: Cannot stat: Stale file handle
tar:
Re: /etc/fstab rootfs options vs grub2 rootflags cmdline
Thanks. Now my fstab options are rw,noatime,compress=lzo,discard,autodefrag,subvolid=257,subvol=/@

I tried to add rootflags=noatime,compress=lzo,discard,autodefrag to GRUB_CMDLINE_LINUX in /etc/default/grub as you suggested, but my system didn't manage to boot, probably because grub automatically adds rootflags=subvol=@ and only a single rootflags option is taken into account. Do you have any suggestions?

Niccolò
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
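[Editor's note] One workaround that matches the failure mode described above is to stop passing a second rootflags= and instead fold everything into a single combined rootflags= value. This is a sketch, not a tested recipe: it assumes an Arch-style /etc/default/grub and that when two rootflags= appear on the kernel command line only one of them takes effect, so the safe state is having exactly one.

```shell
# /etc/default/grub -- a sketch, assuming only one rootflags= on the
# kernel command line is honoured, so every btrfs option (including the
# subvolume grub would otherwise add on its own) goes into one value:
GRUB_CMDLINE_LINUX="cryptdevice=/dev/disk/by-uuid/c7c8f501-507c-4bd2-a80a-8c7360651f02:cryptroot:allow-discards rootflags=subvol=@,noatime,compress=lzo,discard,autodefrag"
```

After editing, regenerate the config with grub-mkconfig -o /boot/grub/grub.cfg and inspect the generated linux line: if grub-mkconfig still emits its own rootflags=subvol=@ in addition to yours, the position of the two on the final command line decides which one the kernel actually uses.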
/etc/fstab rootfs options vs grub2 rootflags cmdline
Hi,

I have the following options for my rootfs in /etc/fstab: rw,relatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@

grub2 already placed rootflags=subvol=@ in its cmdline, but not the other options. I suppose some of them will automatically be set during remount, but I'm not sure all of them will. Do you know which ones I should manually add to GRUB_CMDLINE_LINUX in /etc/default/grub? Is there any way to check whether they are already enabled? mount shows

/dev/mapper/cryptroot on / type btrfs (rw,relatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@)

but I'm not sure I can trust it: I read that space_cache should trigger "enabling free space tree" in dmesg, but I can't see it, and I don't know about the others.

Thanks,
Niccolò
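[Editor's note] On the "is there any way to check" question: the options field of /proc/self/mounts (which is what mount(8) prints) is the kernel's own view of the active options, so checking there is more reliable than reasoning about what fstab or rootflags should have done. A small sketch for testing a single option against that comma-separated field; has_opt is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# has_opt OPTIONS OPTION -- succeed if OPTION appears as a whole entry
# in the comma-separated options field from /proc/self/mounts.
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# The options string mount reported for / in this thread:
opts="rw,relatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@"

has_opt "$opts" autodefrag && echo "autodefrag: active"
has_opt "$opts" noatime    || echo "noatime: not active"
```

In practice `findmnt -no OPTIONS /` prints the same options field directly, which saves parsing /proc/self/mounts by hand.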
Re: Uncorrectable errors after rebooting with Magic Sysrq Keys
Finally my external drive arrived and I've been able to make a backup and try btrfs check --repair. Unfortunately btrfs check --repair got stuck in an infinite loop like this one (https://www.spinics.net/lists/linux-btrfs/msg54146.html), and after several hours of looping and several gigabytes of logs I had to kill it, which left me with a completely fucked fs. I still have the backup images, so I can restore the old state and try again with updated tools (I used the latest btrfs-progs, 4.5.1, but I also tried 4.4.1).

For those who didn't read the whole thread: I can mount the fs, but it hangs while trying to read certain files and sometimes it remounts read-only. I'm pretty sure the culprit was a bug in 4.6-rc, because the problems started roughly after upgrading. The disk (an SSD) is fine. The fs is on top of dm-crypt and I always mounted it with "rw,relatime,ssd,space_cache,discard,compress=lzo,autodefrag".

You can find the whole logs here: https://drive.google.com/open?id=0Bwe9Wtc-5xF1Z2YwN1Y4U0ROSUU
01_scrub is the scrub output
02_check is the btrfs check output (14MB)
03_repair_short is the btrfs check --repair output truncated to 14MB

I hope someone will be able to help me recover my data, otherwise I will have to back up just the most important files and reinstall the whole system from scratch. Mounting the fs and doing a backup with cp -a wasn't a viable solution, because it got stuck after several GBs.

Niccolò

P.S. I changed my SPF/DKIM/DMARC settings, so this email should no longer go into the spam folder; if it does, please let me know. Thanks.
Re: Uncorrectable errors after rebooting with Magic Sysrq Keys
I finally ran btrfs check --readonly on my fs; sorry it took so long, but it complained about the fs being mounted even though it was mounted read-only, so I had to download a Fedora 24 Alpha live CD to be able to run it. Here it is (8.5MB): https://drive.google.com/open?id=0Bwe9Wtc-5xF1blJGMTNHaDdUQjg

In the meantime, since I suspected it might be a 4.6 regression, I switched back to 4.5.

P.S. Scrub's uncorrectable errors went down from 10 to 4 by themselves, for no apparent reason.

Niccolò
Re: Uncorrectable errors after rebooting with Magic Sysrq Keys
https://bpaste.net/show/df9cc097c1da

This fs is *completely* FUCKED. I can't wait to get my hands on the external drive so I can make a full backup. Is it possible that this is a kernel 4.6 regression? I had problems before, but nothing like this :(

Niccolò
Re: Uncorrectable errors after rebooting with Magic Sysrq Keys
Hi,

Is it 100% safe to run btrfs check without --repair? Because otherwise I will have to wait for my new external drive to arrive and make a backup first.

Thanks,
Niccolò

On Friday 15 April 2016 at 11:30:32 CEST, Qu Wenruo wrote: Would you please run "btrfs check --readonly " and paste the output? The dmesg output looks very improbable: BTRFS error (device dm-0): bad tree block start 245497856 245498111 The latter one is not even aligned to 2. But your system still seems mountable, as you succeeded in running btrfs scrub. So I assume either the tree block is not a critical one or the copy saved you. Thanks, Qu
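[Editor's note] Qu's alignment remark can be checked mechanically: a metadata tree block must start at a multiple of the filesystem's nodesize, and 4096 bytes is the smallest nodesize btrfs allows, so an odd value like the one in the error can never be a real tree block start. A sketch, assuming only that 4K lower bound (a value that passes is merely plausible, not proven valid):

```shell
#!/bin/sh
# aligned_4k N -- succeed if N is a multiple of 4096, the minimum btrfs
# nodesize; anything failing this cannot be a tree block start address.
aligned_4k() { [ $(( $1 % 4096 )) -eq 0 ]; }

aligned_4k 245497856 && echo "245497856: plausible tree block start"
aligned_4k 245498111 || echo "245498111: cannot be a tree block start"
```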
Uncorrectable errors after rebooting with Magic Sysrq Keys
Hi,

Unfortunately, because of buggy upstream support for my hardware (Dell XPS 13 9343), I often have to force a reboot using the magic SysRq keys (REISUB). In fact I get quite a few hangs, and most of the time I am not able to shut down without relying on REISUB. There are obviously times when even REISUB does not work (the kernel is completely unresponsive), but the vast majority of the time it works.

What I do not understand is why the magic SysRq keys leave me with a damaged filesystem: shouldn't an emergency sync plus read-only remount be enough to secure my data? After rebooting with REISUB my system often complains about "read only" files, and if I stat them I get "weird file". I often lose some of my desktop settings, like the plasmoids I had on the desktop or the favourite applications I had in the menu, but what's even stranger is that I often magically recover them later, while doing exactly NOTHING to recover them. This behaviour scares me so much that I'm thinking about switching to another fs if I don't find a solution very soon.

The disk seems fine: https://bpaste.net/show/822d4b4ff902
dmesg: http://paste.pound-python.org/show/wVyHXXOw4emWmWFfVJHQ/

$ sudo btrfs scrub status /
[sudo] password for niko:
scrub status for 28443ff1-5325-45f6-b879-dad895fcdcfb
        scrub started at Fri Apr 15 09:38:09 2016 and finished after 00:08:41
        total bytes scrubbed: 133.94GiB with 10 errors
        error details: csum=10
        corrected errors: 0, uncorrectable errors: 10, unverified errors: 0

(yesterday there were 4 uncorrectable errors, but after today's reboot with the magic SysRq keys there are now 10)

Distro is Arch Linux, kernel is 4.6.0-rc3.

$ btrfs --version
btrfs-progs v4.4.1

Greetings,
Niccolò
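[Editor's note] One thing worth ruling out before blaming the fs: the individual REISUB letters only do anything if the kernel.sysrq bitmask enables their functions. The bit values below come from the kernel's SysRq documentation (4 = keyboard control/unraw, 16 = sync, 32 = remount read-only, 64 = signalling processes, 128 = reboot/poweroff); the sysrq_mask helper is purely illustrative:

```shell
#!/bin/sh
# Build a kernel.sysrq bitmask covering every REISUB step:
#   R = unraw keyboard (4), E/I = send SIGTERM/SIGKILL (64),
#   S = sync (16), U = remount read-only (32), B = reboot (128).
sysrq_mask() {
    echo $(( 4 + 16 + 32 + 64 + 128 ))
}

# Persist it, e.g. as a line in /etc/sysctl.d/99-sysrq.conf:
echo "kernel.sysrq = $(sysrq_mask)"
```

If /proc/sys/kernel/sysrq is missing the sync bit, the S step silently does nothing, which would leave the fs in exactly the torn state described above.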