Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem
here is the info requested, if that helps anyone.

# uname -a
Linux SX20S 4.3.0-040300rc7-generic #201510260712 SMP Mon Oct 26 11:27:59 UTC 2015 i686 i686 i686 GNU/Linux

# aptitude show btrfs-tools
Package: btrfs-tools
State: installed
Automatically installed: no
Version: 4.2.1+ppa1-1~ubuntu15.10.1

# btrfs --version
btrfs-progs v4.2.1

# btrfs fi show Media
Label: 'Media'  uuid: b397b7ef-6754-4ba4-8b1a-fbf235aa1cf8
        Total devices 1  FS bytes used 1.92TiB
        devid 1 size 5.46TiB used 1.93TiB path /dev/sdd1

btrfs-progs v4.2.1

# btrfs fi usage Media
Overall:
    Device size:           5.46TiB
    Device allocated:      1.93TiB
    Device unallocated:    3.52TiB
    Device missing:        0.00B
    Used:                  1.93TiB
    Free (estimated):      3.53TiB  (min: 1.76TiB)
    Data ratio:            1.00
    Metadata ratio:        2.00
    Global reserve:        512.00MiB  (used: 0.00B)

Data,single: Size:1.92TiB, Used:1.92TiB
   /dev/sdd1   1.92TiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sdd1   8.00MiB

Metadata,DUP: Size:5.00GiB, Used:3.32GiB
   /dev/sdd1  10.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sdd1   4.00MiB

System,DUP: Size:8.00MiB, Used:224.00KiB
   /dev/sdd1  16.00MiB

Unallocated:
   /dev/sdd1   3.52TiB

# btrfs-show-super /dev/sdd1
superblock: bytenr=65536, device=/dev/sdd1
csum                    0xae174f16 [match]
bytenr                  65536
flags                   0x1 ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    b397b7ef-6754-4ba4-8b1a-fbf235aa1cf8
label                   Media
generation              11983
root                    34340864
sys_array_size          226
chunk_root_generation   11982
root_level              1
chunk_root              21135360
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             6001173463040
bytes_used              2115339448320
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x61 ( MIXED_BACKREF | BIG_METADATA | EXTENDED_IREF )
csum_type               0
csum_size               4
cache_generation        11983
uuid_tree_generation    11983
dev_item.uuid           819e1c8a-5e55-4992-81d3-f22fdd088dc9
dev_item.fsid           b397b7ef-6754-4ba4-8b1a-fbf235aa1cf8 [match]
dev_item.type           0
dev_item.total_bytes    6001173463040
dev_item.bytes_used     2124972818432
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

I did mount Media -o enospc_debug and now mount shows:

/dev/sdd1 on /media/cheater/Media type btrfs (rw,nosuid,nodev,enospc_debug,_netdev)

On Wed, Dec 30, 2015 at 11:13 PM, Chris Murphy wrote:
> kernel and btrfs-progs versions
> and output from:
> 'btrfs fi show '
> 'btrfs fi usage '
> 'btrfs-show-super '
> 'df -h'
>
> Then umount the volume, and mount with option enospc_debug, and try to
> reproduce the problem, then include everything from dmesg from the
> time the volume was mounted.
>
> --
> Chris Murphy

On Sat, Jan 2, 2016 at 3:09 AM, cheater00 . wrote:
> I have been unable to reproduce so far.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
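For anyone else chasing a premature ENOSPC, the debugging procedure Chris asked for can be scripted roughly as below. The device node and mount point are placeholders (this thread used /dev/sdd1 and /media/cheater/Media), and the guard keeps the sketch from touching a real device by accident:

```shell
DEV=/dev/sdXn     # placeholder: the affected btrfs device
MNT=/mnt/media    # placeholder: its mount point
if [ -b "$DEV" ]; then
    umount "$MNT"
    mount -o enospc_debug "$DEV" "$MNT"
    # reproduce the failing write here, then collect the kernel log:
    dmesg | grep -i btrfs
else
    echo "set DEV and MNT to real values first"
fi
```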
Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem
I have been unable to reproduce so far.
Re: btrfs scrub failing
Last thing for tonight: I tried to run btrfs-debug-tree & direct the output to a file. It crashed during the run with the following errors:

john@mariposa:~$ sudo btrfs-debug-tree /dev/md125p2 >> /media/data1/btrfs-debug-tree-01012016.txt
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
Ignoring transid failure
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
Ignoring transid failure
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
Ignoring transid failure
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
Ignoring transid failure
print-tree.c:1108: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x418d99]
btrfs-debug-tree(btrfs_print_tree+0x2c0)[0x41ad4c]
btrfs-debug-tree(btrfs_print_tree+0x2dc)[0x41ad68]
btrfs-debug-tree(main+0x9a5)[0x432589]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7ff0629ccec5]
btrfs-debug-tree[0x4070e9]

Hopefully this will help the developers.
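One practical note on capturing such a crash: `>>` only redirects stdout, so the transid complaints and the assertion backtrace above went to the terminal rather than into the file. A sketch that captures both streams (the device path is a placeholder; this report used /dev/md125p2):

```shell
DEV=/dev/sdXn                   # placeholder: the filesystem's device
OUT=/tmp/btrfs-debug-tree.txt
if [ -b "$DEV" ]; then
    # 2>&1 folds stderr (errors, backtraces) into the same file as the dump
    btrfs-debug-tree "$DEV" > "$OUT" 2>&1
else
    echo "set DEV to the filesystem's device first"
fi
```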
-John

On Fri, Jan 1, 2016 at 11:04 PM, John Center wrote:
> Hi Duncan,
>
> Doing some more digging, I ran btrfs-image & found the following
> errors. I'm not sure how useful this is, or what this means in terms
> of the other btrfs-tools messages. Maybe more clues?
>
> Thanks.
>
> -John
>
> john@mariposa:~$ sudo btrfs-image -c9 -t4 /dev/md125p2 /media/data1/btrfs.image-01012016
> WARNING: The device is mounted. Make sure the filesystem is quiescent.
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337679745024 wanted 1036372 found 1036377
> parent transid verify failed on 337679745024 wanted 1036372 found 1036377
> parent transid verify failed on 337679745024 wanted 1036372 found 1036377
> parent transid verify failed on 337679745024 wanted 1036372 found
[PATCH] BTRFS: Adds the files and options needed for Hybrid Storage
This patch adds the file required for Hybrid Storage. It contains the memory, time and size limits for the cache and the statistics that will be provided while the cache is operating. It also adds the Makefile changes needed to add the Hybrid Storage.

Signed-off-by: Sanidhya Solanki
---
 fs/btrfs/Makefile |  2 +-
 fs/btrfs/cache.c  | 58 +++
 2 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 fs/btrfs/cache.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 6d1d0b9..dc56ae4 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-	   uuid-tree.o props.o hash.o
+	   uuid-tree.o props.o hash.o cache.o

 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/cache.c b/fs/btrfs/cache.c
new file mode 100644
index 000..0ece7a1
--- /dev/null
+++ b/fs/btrfs/cache.c
@@ -0,0 +1,58 @@
+/*
+ * (c) Sanidhya Solanki, 2016
+ *
+ * Licensed under the FSF's GNU Public License v2 or later.
+ */
+#include <linux/types.h>
+
+/* Cache size configuration (in MiB). */
+#define MAX_CACHE_SIZE		1
+#define MIN_CACHE_SIZE		10
+
+/* Time (in seconds) before retrying to increase the cache size. */
+#define CACHE_RETRY		10
+
+/* Space required to be free (in MiB) before increasing the size of the
+ * cache. If cache size is less than cache_grow_limit, a block will be freed
+ * from the cache to allow the cache to continue growing.
+ */
+#define CACHE_GROW_LIMIT	100
+
+/* Size required to be free (in MiB) after we shrink the cache, so that it
+ * does not grow in size immediately.
+ */
+#define CACHE_SHRINK_FREE_SPACE_LIMIT	100
+
+/* Age (in seconds) of oldest and newest block in the cache. */
+#define MAX_AGE_LIMIT		300	/* Five Minute Rule recommendation;
+					 * optimum size depends on size of
+					 * data blocks.
+					 */
+#define MIN_AGE_LIMIT		15	/* In case of cache stampede. */
+
+/* Memory constraints (in percentage) before we stop caching. */
+#define MIN_MEM_FREE		10
+
+/* Cache statistics. */
+struct cache_stats {
+	u64 cache_size;
+	u64 maximum_cache_size_attained;
+	int cache_hit_rate;
+	int cache_miss_rate;
+	u64 cache_evicted;
+	u64 duplicate_read;
+	u64 duplicate_write;
+	int stats_update_interval;
+};
+
+#define cache_size	CACHE_SIZE	/* Current cache size. */
+#define max_cache_size	MAX_SIZE	/* Max cache limit. */
+#define min_cache_size	MIN_SIZE	/* Min cache limit. */
+#define cache_time	MAX_TIME	/* Maximum time to keep data in cache. */
+#define evicted_csum	EVICTED_CSUM	/* Checksum of the evicted data
+					 * (to avoid repeatedly caching
+					 * data that was just evicted).
+					 */
+#define read_csum	READ_CSUM	/* Checksum of the read data. */
+#define write_csum	WRITE_CSUM	/* Checksum of the written data. */
+#define evict_interval	EVICT_INTERVAL	/* Time to keep data before eviction. */
--
2.5.0
Re: Add big device, remove small device, read-only
On Fri, Jan 1, 2016 at 4:47 AM, Rasmus Abrahamsen wrote:
> Happy New Year!
>
> I have a raid with a 1TB, .5TB, 1.5TB and recently added a 4TB and want to
> remove the 1.5TB. When saying btrfs dev delete it turned into readonly. I am
> on 4.2.5-1-ARCH and btrfs-progs v4.3.1 what can I do?

btrfs fi show
lsblk -f
btrfs fi usage

> he remove, so I did --since=yesterday
> I am looking at the log now, please stand by.
> This is my log
> http://pastebin.com/mCPi3y9r

What's this?

Dec 31 15:03:56 rasmusahome systemd-udevd[6340]: inotify_add_watch(9, /dev/sdd1, 10) failed: No space left on device
Dec 31 15:04:01 rasmusahome kernel: sdd: sdd1
Dec 31 15:04:01 rasmusahome systemd-udevd[6341]: inotify_add_watch(9, /dev/sdd1, 10) failed: No space left on device
Dec 31 15:05:43 rasmusahome kernel: BTRFS info (device sdb1): disk added /dev/sdd1

Why is udev first complaining about no space left on sdd1, but then it's being added to the btrfs volume?

--
Chris Murphy
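Worth noting: the ENOSPC in those udev lines comes from inotify_add_watch(), so it means the per-user inotify watch limit was exhausted, not that the disk is full. The limit can be inspected (and, as root, raised) like this:

```shell
# current per-user limit on inotify watches
cat /proc/sys/fs/inotify/max_user_watches
# to raise it (as root), e.g.:
#   sysctl fs.inotify.max_user_watches=524288
```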
btrfs send fail and check hang
Hi,

When trying to send a snapshot I'm now getting errors such as:

ERROR: failed to open backups/xps13/@home/@home.20151229_13:43:09/alistair/.mozilla/firefox/yu3bxg7y.default/cookies.sqlite. No such file or directory

and

ERROR: could not find parent subvolume

This script has been running without a problem for several weeks. I can reboot the system and the filesystem mounts without a problem. I can also navigate through the existing snapshots and access files without any problem (these are all read-only snapshots, so I'm not attempting to write anything). There are no obvious errors in the system log (I checked the log manually, and also have Marc Merlin's sec script running to monitor for errors).

I tried running a read-only btrfs check, however it is hanging while checking fs roots:

> sudo umount /srv/d2root
> sudo btrfs check /dev/sda
checking filesystem on /dev/sda
UUID: d8daaa62-afa2-4654-b7de-22fdc8456e03
checking extents
checking free space cache
checking fs roots
^C

Disk IO was several MB/s during the initial part of the check and dropped to 0 on checking fs roots. I left it for about 10 minutes before interrupting. The same happens for /dev/sdb.
General system information:

uname -a
Linux alarmpi 4.1.15-1-ARCH #1 SMP Tue Dec 15 18:39:32 MST 2015 armv7l GNU/Linux

btrfs --version
btrfs-progs v4.3.1

mount | grep btrfs
/dev/sda on /srv/d2root type btrfs (rw,noatime,compress-force=zlib,space_cache)

> sudo btrfs fi show /srv/d2root
Label: 'data2'  uuid: d8daaa62-afa2-4654-b7de-22fdc8456e03
        Total devices 2  FS bytes used 117.34GiB
        devid 1 size 1.82TiB used 118.03GiB path /dev/sda
        devid 2 size 1.82TiB used 118.03GiB path /dev/sdb

> sudo btrfs fi df /srv/d2root
Data, RAID1: total=117.00GiB, used=116.76GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=595.36MiB
GlobalReserve, single: total=208.00MiB, used=0.00B

> sudo btrfs fi usage /srv/d2root
Overall:
    Device size:           3.64TiB
    Device allocated:      236.06GiB
    Device unallocated:    3.41TiB
    Device missing:        0.00B
    Used:                  234.68GiB
    Free (estimated):      1.70TiB  (min: 1.70TiB)
    Data ratio:            2.00
    Metadata ratio:        2.00
    Global reserve:        208.00MiB  (used: 0.00B)

Data,RAID1: Size:117.00GiB, Used:116.76GiB
   /dev/sda  117.00GiB
   /dev/sdb  117.00GiB

Metadata,RAID1: Size:1.00GiB, Used:595.36MiB
   /dev/sda    1.00GiB
   /dev/sdb    1.00GiB

System,RAID1: Size:32.00MiB, Used:48.00KiB
   /dev/sda   32.00MiB
   /dev/sdb   32.00MiB

Unallocated:
   /dev/sda    1.70TiB
   /dev/sdb    1.70TiB

Help!

Many Thanks,
Alistair
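For reference, an incremental send needs the parent snapshot present and read-only on both ends; "could not find parent subvolume" usually means the snapshot given to -p no longer matches one the receiving side knows about. A minimal sketch (all paths are hypothetical placeholders, not Alistair's actual script):

```shell
PARENT=/srv/d2root/snapshots/@home.old   # hypothetical earlier read-only snapshot
CHILD=/srv/d2root/snapshots/@home.new    # hypothetical newer read-only snapshot
DEST=/mnt/backup                         # hypothetical receive target
if [ -d "$PARENT" ] && [ -d "$CHILD" ]; then
    # incremental send: only the delta between PARENT and CHILD is streamed
    btrfs send -p "$PARENT" "$CHILD" | btrfs receive "$DEST"
else
    echo "point PARENT, CHILD and DEST at real snapshots first"
fi
```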
Re: btrfs scrub failing
Am Freitag, 1. Januar 2016, 13:20:49 CET schrieb John Center:
> > On Jan 1, 2016, at 12:41 PM, Martin Steigerwald wrote:
> > Am Freitag, 1. Januar 2016, 11:41:20 CET schrieb John Center:
[…]
> >>> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> >>>
> >>> A couple months ago, which would have made it around the 4.2 kernel
> >>> you're running (with 4.3 being current and 4.4 nearly out), there were a
> >>> number of similar scrub aborted reports on the list.
> >>
> >> I must have missed that, I'll check the list again to try & understand the
> >> issue better.
> >
> > I had repeatedly failing scrubs as mentioned in another thread here, until
> > I used 4.4 kernel. With 4.3 kernel scrub also didn't work. I didn't use
> > the debug options you used above and I am not sure whether I had this
> > scrub issue with 4.2 already, so I am not sure it has been the same
> > issue. But you may need to run 4.4 kernel in order to get scrub working
> > again.
> >
> > See my thread "[4.3-rc4] scrubbing aborts before finishing" for details.
>
> I was afraid of this. I just read your thread. I generally try to stay away
> from kernels so new, but I may have to try it. Was there any reason you
> didn't go to 4.1 instead? (I run win8.1 in VirtualBox 5.0.12, when I need
> to run somethings under Windows. I'd have to wait until 4.4 is released &
> supported to do that.)

So far 4.4-rc6 is pretty stable for me. And I think it's almost at release, as rc7 is out already.

Reason for not going with 4.1? Ey, that would be downgrading, wouldn't it? But sure, it is also an option.

Virtualbox 5.0.12-dfsg-2 as packaged by Debian runs fine here with 4.4-rc6.

Thanks,
--
Martin
Re: Add big device, remove small device, read-only
I accidentally sent my messages directly to Duncan, I am copying them in here.

Hello Duncan,

Thank you for the amazing response. Wow, you are awesome. Here is the output of fi show, fi df and mount, sorry for not providing them to begin with: http://pastebin.com/DpiuDvRy

> On 01 Jan 2016, at 17:39, Duncan <1i5t5.dun...@cox.net> wrote:
>
> Rasmus Abrahamsen posted on Fri, 01 Jan 2016 12:47:08 +0100 as excerpted:
>
>> Happy New Year!
>>
>> I have a raid with a 1TB, .5TB, 1.5TB and recently added a 4TB and want
>> to remove the 1.5TB. When saying btrfs dev delete it turned into
>> readonly. I am on 4.2.5-1-ARCH and btrfs-progs v4.3.1 what can I do?
>
> This isn't going to help with the specific problem, and doesn't apply to
> your case now anyway as the 4 TB device has already been added so all
> you're doing now is deleting the old one, but FWIW...
>
> There's a fairly new command, btrfs replace, that can be used to directly
> replace an old device with a new one, instead of doing btrfs device add,
> followed by btrfs device delete/remove.
>
>> On top of that, my linux is on this same raid, so perhaps btrfs is
>> writing some temp files in the filesystem but cannot?
>> . /dev/sdc1 on / type btrfs
>> (ro,relatime,space_cache,subvolid=1187,subvol=/linux)
>
> Your wording leaves me somewhat confused. You say your Linux, presumably
> your root filesystem, is on the same raid as the filesystem that is
> having problems. That would imply that it's a different filesystem,
> which in turn would imply that the raid is below the filesystem level,
> say mdraid, dmraid, or hardware raid, with both your btrfs root
> filesystem, and the separate btrfs with the problems, on the same raid-
> based device, presumably partitioned so you can put multiple filesystems
> on the same device.
>
> Which of course would generally mean the two btrfs themselves aren't
> raid, unless of course you are using at least one non-btrfs raid as one
> device under a btrfs raid. But while implied, that's not really
> supported by what you said, which suggests a single btrfs raid
> filesystem, instead. In which case, perhaps you meant that this
> filesystem contains your root filesystem as well, not just that the raid
> contains it.
>
> Of course, if your post had included the usual btrfs fi show and btrfs fi
> df (and btrfs fi usage would be good as well) that the wiki recommends be
> posted with such reports, that might make things clearer, but it doesn't,
> so we're left guessing...
>
> But I'm assuming you meant a single multi-device btrfs, not multiple
> btrfs that happen to be on the same non-btrfs raid.

My root / is a subvolume of my filesystem. I will only be talking about the filesystem named “Fortune”, the one named “Glassbox” is not relevant to this problem.

> Another question the show and df would answer is what btrfs raid mode
> you're running. The default for multiple device btrfs is of course raid1
> metadata and single mode data, but you might well have set it up with
> data and metadata in the same mode, and/or with raid0/5/6/10 for one or
> both data and metadata. You didn't say and didn't provide the btrfs
> command output that would show it, so...
>
>> Ralle: did you do balance before removing?
>>
>> I did not, but I have experience with it balancing itself upon doing so.
>> Upon removing a device, that is.
>> I am just not sure how to proceed now that everything is read-only.
>
> You were correct in that regard. btrfs device remove (or btrfs replace)
> trigger balance as part of the process, and balancing after adding a
> device, only to have balance trigger again with a delete/remove, is
> needless.
>
> Actually, I suspect the remove-triggered balance ran across a problem it
> didn't know how to handle when attempting to move one of the chunks from
> the existing device, and that's what put the filesystem in read-only
> mode. That's usually what happens when btrfs device remove triggers
> problems and people report it, anyway. A balance before the remove would
> have simply triggered it then, anyway.
>
> But what the specific problem is, and what to do about it, remains to be
> seen. Having that btrfs fi show and btrfs fi df would be a good start,
> letting us know at least what raid type we're dealing with, etc.
>
>> I hope that you have backups?
>>
>> I do have backups, but it's on Crashplan, so I would prefer not to have
>> to go there.
>
> That's wise, both him asking and you replying you already have them, but
> just want to avoid using them if possible. Waaayyy too many folks
> posting here find out the hard way about the admin's first rule of
> backups, in simplified form, that if you don't have backups, you are
> declaring by your actions that the data not backed up is worth less to
> you than the time, resources and hassle required to do those backups,
> despite any after-the-fact protests to the contrary. Not being in that
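The btrfs replace workflow Duncan mentions can be sketched as follows; the device nodes and mount point are placeholders. It migrates data directly from the old device to the new one, avoiding the separate add-then-delete rebalance:

```shell
OLD=/dev/sdXn     # placeholder: device being retired
NEW=/dev/sdYn     # placeholder: replacement device (must be at least as large)
MNT=/mnt/fortune  # placeholder: the filesystem's mount point
if [ -b "$OLD" ] && [ -b "$NEW" ]; then
    btrfs replace start "$OLD" "$NEW" "$MNT"
    btrfs replace status "$MNT"
else
    echo "set OLD, NEW and MNT before running"
fi
```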
Re: Unrecoverable fs corruption?
On Fri, 2016-01-01 at 08:13 +0000, Duncan wrote:
> you can also try a read-only scrub

OT: I just wondered, would a balance include everything a scrub includes (i.e. read+verify all data and rebuild any errors from the other devices / block copies)... of course in addition to also copying all "good" data... and perhaps with the difference that you don't get the detailed information as in scrub, but only the kernel log messages about errors?

> In this case,
> you'll need to recover from the degraded-mount working device as if the
> second one had entirely failed.
>
> What I'd do in this case, if you haven't done so already, is that read-
> only btrfs scrub, just to see where you are in terms of corruption on the
> remaining device.

I don't think that this is the best order of the steps - at least not when it's about precious data. Doing a scrub at this phase would just read all data, telling you the status,... but first you should try to copy as much as possible (just in case the remaining good drive fails as well) and *then* do the scrub to see what's actually good or not.

Alternatively, the first step could be backing up to another drive in the sense of a dd-copy (beware of the problem of UUID collisions in btrfs: you MUST make sure here that the kernel doesn't see[0] devices with the same IDs, which is of course the case with dd, unless you write to e.g. an image file and not a device).

This has advantages and disadvantages:
- a btrfs rebuild would only rebuild those blocks that are actually used... so you need to do fewer reads from a possibly soon-to-be-dying device
- OTOH, you only copy the blocks which btrfs thinks are actually used,... and if it later turned out that there are filesystem corruptions in these, you don't have any other areas (with possibly older data) where you could try some last-resort recoveries.

Cheers,
Chris.
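The copy-first-then-scrub order suggested above might look like this as a sketch; all paths are placeholders. Writing to an image file rather than to a raw device sidesteps the UUID-collision problem, because the kernel never scans a plain file as a btrfs device:

```shell
DEV=/dev/sdXn                  # placeholder: the surviving device
IMG=/mnt/backup/surviving.img  # image file on a separate, healthy disk
if [ -b "$DEV" ]; then
    # raw copy first, tolerating read errors and padding them with zeros
    dd if="$DEV" of="$IMG" bs=64K conv=noerror,sync status=progress
    # only then assess the damage with a read-only scrub (-r)
    btrfs scrub start -Bdr "$DEV"
else
    echo "set DEV to the surviving device first"
fi
```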
Re: Btrfs Check - "type mismatch with chunk"
On Fri, 2015-12-25 at 08:06 +0000, Duncan wrote:
> I wasn't personally sure if 4.1 itself was affected or not, but the wiki
> says don't use 4.1.1 as it's broken with this bug, with the quick-fix in
> 4.1.2, so I /think/ 4.1 itself is fine. A scan with a current btrfs
> check should tell you for sure. But if you meant 4.1.1 and only typed
> 4.1, then yes, better redo.

What exactly was that bug in the 4.1.1 mkfs, and how would one notice that one suffers from it?
I created a number of personal filesystems that I use "productively" and I'm not 100% sure during which version I've created them... :/
Is there some easy way to find out, like a fs creation time stamp?

> Unfortunately, the btrfs-
> convert bug isn't as nailed down, but btrfs-convert has a warning up on
> the wiki anyway, as currently being buggy and not reliable.

I hope I don't step on anyone's toes who puts effort into this, but with all due respect,... I think the whole convert thing is at least in parts a waste of manpower - or perhaps better said: it would be nice to have, but given the lack of manpower in btrfs development and the numerous areas[0] that would need some urgent and probably lots of care,... having a convert from other filesystems to btrfs seems like a luxury that isn't really needed.

People don't choose btrfs because they can easily convert, I'd guess. Either they'll choose it (at some time in the future) because it's the default in distros then,... or they choose it already nowadays because it's already pretty great and has awesome features.

The conversion is always a thing which at best works and doesn't make things worse[1],... in practice though it's rather likely to fail than to work, because the convert tools would need to keep up with developments on both sides (btrfs and e.g. ext) forever.

If people want to change their fs, they should simply copy the data from one to the other. Saves a lot of time for the devs :-)

Cheers,
Chris.
[0] From the really awful things like the UUID collision->corruption issues,... over the pretty serious things like all the missing RAID functionality (just look at the recent reports on the list, where even RAID1 seems to be far from production ready, not to talk about 5/6),... and many other bugs (like all the recent error reports about non-working scrubs, etc.)... to the really strongly desired wishlist features.

[1] It's kinda like the situation when many photographers think it makes sense to convert their XYZ RAW format to DNG, which is IMHO inherently stupid. At best all information from XYZ would be preserved (which is however unlikely); at worst you lose information. And since there are good readers (dcraw) for basically all RAW formats, there's really not much need to convert to DNG. (Which doesn't mean I wouldn't like DNG, but it only makes sense if the camera does it natively.)
Re: btrfs scrub failing
Hi Duncan, On Fri, Jan 1, 2016 at 12:05 PM, Duncan <1i5t5.dun...@cox.net> wrote: > John Center posted on Fri, 01 Jan 2016 11:41:20 -0500 as excerpted: > >> If this doesn't resolve the problem, what would you recommend my next >> steps should be? I've been hesitant to run too many of the btrfs-tools, >> mainly because I don't want to accidentally screw things up & I don't >> always know how to interpret the results. (I ran btrfs-debug-tree, >> hoping something obvious would show up. Big mistake. ) > > LOLed at that debug-tree remark. Been there (with other tools) myself. > > Well, I'm hoping someone who had the problem can confirm whether it's > fixed in current kernels (scrub is one of those userspace commands that's > mostly just a front-end to the kernel code which does the real work, so > kernel version is the important thing for scrub). I'm guessing so, and > that you'll find the problem gone in 4.3. > > We'll cross the not-gone bridge if we get to it, but again, if the other > people who had the similar problem can confirm whether it disappeared for > them with the new kernel, it would help a lot, as there were enough such > reports that if it's the same problem and still there for everyone (which > I doubt as I expect there'd still be way more posts about it if so, but > confirmation's always good), nothing to do but wait for a fix, while if > not, and you still have your problem, then it's a different issue and the > devs will need to work with you on a fix specific to your problem. > Ok, I'm at the next bridge. 
:-(

I upgraded the kernel to 4.4rc7 from the Ubuntu Mainline archive & I just ran the scrub:

john@mariposa:~$ sudo /sbin/btrfs scrub start -BdR /dev/md125p2
ERROR: scrubbing /dev/md125p2 failed for device id 1: ret=-1, errno=5 (Input/output error)
scrub device /dev/md125p2 (id 1) canceled
	scrub started at Fri Jan 1 19:38:21 2016 and was aborted after 00:02:34
	data_extents_scrubbed: 111031
	tree_extents_scrubbed: 104061
	data_bytes_scrubbed: 2549907456
	tree_bytes_scrubbed: 1704935424
	read_errors: 0
	csum_errors: 0
	verify_errors: 0
	no_csum: 1573
	csum_discards: 0
	super_errors: 0
	malloc_errors: 0
	uncorrectable_errors: 0
	unverified_errors: 0
	corrected_errors: 0
	last_physical: 4729667584

I checked dmesg & this appeared:

[11428.983355] BTRFS error (device md125p2): parent transid verify failed on 241287168 wanted 33554449 found 17
[11431.028399] BTRFS error (device md125p2): parent transid verify failed on 241287168 wanted 33554449 found 17

Where do I go from here? Thanks for your help.

-John
Re: btrfs scrub failing
Hi Duncan,

Doing some more digging, I ran btrfs-image & found the following errors. I'm not sure how useful this is, or what this means in terms of the other btrfs-tools messages. Maybe more clues?

Thanks.

-John

john@mariposa:~$ sudo btrfs-image -c9 -t4 /dev/md125p2 /media/data1/btrfs.image-01012016
WARNING: The device is mounted. Make sure the filesystem is quiescent.
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
Ignoring transid failure
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
Ignoring transid failure
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
Ignoring transid failure
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
Ignoring transid failure
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
Ignoring transid failure
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
Ignoring transid failure
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
Ignoring transid failure
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
Ignoring transid failure
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
Ignoring transid failure
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
Ignoring transid failure
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
Ignoring transid failure
parent transid verify failed on 337683251200 wanted 1036372 found 1036377
parent transid verify
failed on 337683251200 wanted 1036372 found 1036377 parent transid verify failed on 337683251200 wanted 1036372 found 1036377 parent transid verify failed on 337683251200 wanted 1036372 found 1036377 Ignoring transid failure parent transid verify failed on 337683398656 wanted 1036372 found 1036377 parent transid verify failed on 337683398656 wanted 1036372 found 1036377 parent transid verify failed on 337683398656 wanted 1036372 found 1036377 parent transid verify failed on 337683398656 wanted 1036372 found 1036377 Ignoring transid failure parent transid verify failed on 337682432000 wanted 1036369 found 1036377 parent transid verify failed on 337682432000 wanted 1036369 found 1036377 parent transid verify failed on 337682432000 wanted 1036369 found 1036377 parent transid verify failed on 337682432000 wanted 1036369 found 1036377 Ignoring transid failure parent transid verify failed on 337633558528 wanted 1036368 found 1036377 parent transid verify failed on 337633558528
Re: Btrfs send / receive freeze system?
On Fri, Jan 1, 2016 at 4:51 PM, fugazzi® wrote:
> Hi everyone.
> It's been a few weeks since I converted my root partition to btrfs with three
> subvolumes named boot, root, home. I'm booting with /boot on a subvolume.
>
> I'm using btrfs send and receive to make backups of the three snapshotted
> subvolumes on a second btrfs-formatted drive, with three commands like this:
>
> btrfs send /btrfs-root/snap-root/ | gzip > $BKFOLDER/root.dump.gz
>
> Sometimes, let's say once a week, the system completely freezes (mouse,
> keyboard) during the send; the only solution was the reset button. The freeze
> happened in a different place every time. Last time it happened at 80% of the
> home send, for example.
>
> The freeze also happened during a send/receive from an external e-SATA drive
> (to copy some mp3s using send instead of rsync) to the same internal drive
> where the backups are also made.
>
> The system always ran and was stable with XFS/xfsdump.
>
> Kernel is 4.3.3, btrfs-progs are 4.3.1, system is Arch Linux 64-bit, RAM 8GB,
> mainboard Asus Striker Extreme Nvidia 680i, 8 years old.
>
> After the crash nothing is shown in the systemd log, it simply freezes.

You could try this (maybe you already have) and see if and where the problem is in btrfs:

https://en.m.wikipedia.org/wiki/Magic_SysRq_key

Instead of piping to gzip you could do:

 | btrfs receive -vv

and trace back at which file the freeze happens. Maybe that tells something about the source filesystem.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
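The suggestion above, replacing the gzip pipe with a verbose receive so that the last file printed before a hang points at where the freeze happened, might look like the following. This is a sketch only: the snapshot path comes from the original post, but the destination mount point is a hypothetical example, and the script only prints the command (dry run) rather than executing it.

```shell
#!/bin/sh
# Sketch: trace where a freeze happens during btrfs send by piping to a
# verbose receive instead of gzip. DEST is a hypothetical example path.
SNAP=/btrfs-root/snap-root          # read-only snapshot to send (from the post)
DEST=/mnt/backup-drive/snapshots    # mount point of the receiving btrfs (example)

# With -vv, receive prints each file as it is created, so the last line
# shown on screen before a hang identifies the problem area.
CMD="btrfs send $SNAP | btrfs receive -vv $DEST"

# Dry run: show the command so it can be reviewed before running it for real.
echo "$CMD"
```

Running the printed pipeline interactively (not redirected to a file) is what makes the last-file-before-freeze visible.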
Re: btrfs send clone use case
Chris Murphy posted on Thu, 31 Dec 2015 13:54:42 -0700 as excerpted:

> I haven't previously heard of this use case for the -c option.
> It seems to work (no errors or fs weirdness afterward).
>
> The gist:
> send a snapshot from drive 1 to drive 2;
> rw snapshot of the drive 2 copy,
> and then make changes to it,
> then make an ro snapshot;
> now send it back to drive 1 *as an incremental* send.

While, as you likely know, my own use-case doesn't use send/receive, based on previous on-list discussion I considered this the obvious workaround to the problem that the current send-stream format doesn't include enough inheritance metadata for send/receive to properly handle a /reverse/ send -p.

Where -p works it's the most efficient method, but due to this missing send-stream inheritance metadata it apparently can't work in the reverse case, where the usual receive end is now the send end. Doing -c clones, while not /quite/ as efficient as -p because more metadata is sent, is still far more efficient than a full send, and it can work in this reverse case (where the original send side is now the receive side) because -c isn't as strict as -p: it's more metadata-verbose in place of that strictness, where -p would fail for lack of the appropriate inheritance metadata in the stream format.

That missing -p inheritance metadata, being effectively just one more item, would still be much more efficient than -c clones, as the clone format is generally more metadata-verbose in order to properly identify per-extent clones; but it's simply not there in the current format.

When the send format is eventually version-bumped, this additional metadata item should be included, making send -p work in these reverse-send cases. But the developers ideally want to do just one more "final" send-stream format bump including all the changes they've found to be needed, so they're holding off on the bump for the moment, to be able to include anything else they've overlooked when they finally do it.

That's the state of send/receive as I understand it, anyway, being interested in it on-list but not a current user. But with this usage of -c being almost precisely that "reverse-send" usage, only with an additional change thrown in at the normal receive side before the send, I'd actually have been surprised if it /didn't/ work as you outlined. =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
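The clone-based "reverse incremental" workflow Chris outlined can be written out as a command sequence. Mount points and snapshot names below are invented for illustration, and the script only builds and prints the commands (dry run); it is a sketch of the steps, not a tested recipe.

```shell
#!/bin/sh
# Sketch of the -c "reverse incremental" workflow from the thread.
# /mnt/drive1, /mnt/drive2, snap1, snap2, work are hypothetical names.
A=/mnt/drive1
B=/mnt/drive2

# 1. Send a read-only snapshot from drive 1 to drive 2.
STEP1="btrfs send $A/snap1 | btrfs receive $B"
# 2. Make a writable snapshot of the received copy (and then modify it).
STEP2="btrfs subvolume snapshot $B/snap1 $B/work"
# 3. Snapshot the changed copy read-only again so it can be sent.
STEP3="btrfs subvolume snapshot -r $B/work $B/snap2"
# 4. Send it back to drive 1 incrementally, using -c (not -p) so the stream
#    may reference extents drive 1 already holds via its copy of snap1.
STEP4="btrfs send -c $B/snap1 $B/snap2 | btrfs receive $A"

# Dry run: print the steps instead of executing them.
for s in "$STEP1" "$STEP2" "$STEP3" "$STEP4"; do
    echo "$s"
done
```

Step 4 is the part the thread is about: -c tolerates the reverse direction where -p would fail for lack of inheritance metadata in the stream.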
still kworker at 100% cpu in all of device size allocated with chunks situations with write load
First: Happy New Year to you!

Second: Take your time. I know it's the holidays for many. For me it means I easily have time to follow up on this.

On Wednesday, 16 December 2015, 09:20:45 CET, Qu Wenruo wrote:
> Chris Mason wrote on 2015/12/15 16:59 -0500:
> > On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:
> >> Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> >>> Hi!
> >>>
> >>> For me it is still not production ready.
> >>
> >> Yes, this is the *FACT*, and not everyone has a good reason to deny it.
> >>
> >>> Again I ran into:
> >>>
> >>> btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> >>> random write into big file
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=90401
> >>
> >> Not sure about the guidelines for other filesystems, but it will attract
> >> more developers' attention if it can be posted to the mailing list.
> >>
> >>> No matter whether SLES 12 uses it as the default for root, no matter
> >>> whether Fujitsu and Facebook use it: I will not let this onto any
> >>> customer machine without lots and lots of underprovisioning and
> >>> rigorous free-space monitoring. Actually I will renew my recommendation
> >>> in my trainings to be careful with btrfs.
> >>>
> >>> From my experience the monitoring would check for:
> >>>
> >>> merkaba:~> btrfs fi show /home
> >>> Label: 'home'  uuid: […]
> >>>         Total devices 2 FS bytes used 156.31GiB
> >>>         devid 1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
> >>>         devid 2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home
> >>>
> >>> If "used" is the same as "size", raise a big fat alarm. The condition is
> >>> necessary but not sufficient: the fs can run for quite some time just
> >>> fine without any issues, but I have never seen a kworker thread using
> >>> 100% of one core for an extended period of time, blocking everything
> >>> else on the fs, without this condition being met.
> >>
> >> And special advice on device size from myself:
> >> Don't use devices over 100G but less than 500G.
> >> Over 100G leads btrfs to use big chunks, where data chunks can be at
> >> most 10G and metadata 1G.
> >>
> >> I have seen a lot of users with about 100~200G devices hit unbalanced
> >> chunk allocation (a 10G data chunk easily takes the last available
> >> space, leaving later metadata nowhere to be stored).
> >
> > Maybe we should tune things so the size of the chunk is based on the
> > space remaining instead of the total space?
>
> Submitted such a patch before.
> David pointed out that such behavior causes a lot of small fragmented
> chunks in the last several GB, which may make balance behavior less
> predictable than before.
>
> At least we can change the current 10% chunk size limit to 5% to make
> the problem harder to trigger. It's a simple and easy solution.
>
> Another cause of the problem is that we understated the chunk-size change
> for filesystems at the borderline of big chunks.
>
> For 99G, the chunk size limit is 1G, and it needs 99 data chunks to
> fully cover the fs.
> But for 100G, it only needs 10 chunks to cover the fs,
> and the fs would need to be 990G to match that chunk count again.
>
> The sudden drop in chunk count is the root cause.
>
> So we'd better reconsider both the big-chunk size limit and the chunk
> size limit to find a balanced solution.

Did you come to any conclusion here? Is there anything I can change with my home btrfs filesystem to try to find out what works?

The challenge is that it doesn't happen under defined circumstances. So far I only know the necessary condition, but not a sufficient condition, for it to happen.

Another user ran into the issue and reported his findings in the bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=90401#c14

Thanks,
--
Martin
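The monitoring condition Martin describes (raise an alarm once any devid's "used" reaches its "size") can be checked by parsing `btrfs fi show` output. The sketch below embeds sample output in the format quoted in the thread so the parsing is testable on its own; in a real monitor the sample would come from running `btrfs fi show` itself, and the string comparison assumes both fields report the same unit, as they do in the quoted output.

```shell
#!/bin/sh
# Sketch: flag devices whose chunk allocation ("used") equals the device
# size, the necessary condition Martin describes for the kworker spins.
# The sample below mimics the "btrfs fi show" output quoted in the thread,
# with devid 1 artificially set to fully allocated for demonstration.
sample="Label: 'home'  uuid: xxx
    Total devices 2 FS bytes used 156.31GiB
    devid    1 size 170.00GiB used 170.00GiB path /dev/mapper/msata-home
    devid    2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home"

ALARMS=$(printf '%s\n' "$sample" | awk '
/devid/ {
    # Fields appear as: devid N size S used U path P
    for (i = 1; i <= NF; i++) {
        if ($i == "size") size = $(i + 1)
        if ($i == "used") used = $(i + 1)
        if ($i == "path") dev = $(i + 1)
    }
    # String comparison is enough when size and used use the same unit.
    if (size == used) print dev
}')
echo "fully allocated: $ALARMS"
```

A cron job could run this against live `btrfs fi show` output and alert when the result is non-empty.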
Add big device, remove small device, read-only
Happy New Year!

I have a raid with a 1TB, .5TB, and 1.5TB drive, recently added a 4TB, and want to remove the 1.5TB. When I said btrfs dev delete, the filesystem turned read-only. I am on 4.2.5-1-ARCH and btrfs-progs v4.3.1. What can I do?

On top of that, my Linux is on this same raid, so perhaps btrfs is trying to write some temp files to the filesystem but cannot?

/dev/sdc1 on / type btrfs (ro,relatime,space_cache,subvolid=1187,subvol=/linux)

Ralle: did you do balance before removing?

I did not, but I have experience with it balancing itself upon doing so. Upon removing a device, that is. I am just not sure how to proceed now that everything is read-only.

I dunno, but generally with things that can crumble you should make them crumble step by step. Which, in combination with Arch + the newest bleeding-edge kernel, should be mandatory. I hope that you have backups?

I do have backups, but they're on Crashplan, so I would prefer not to have to go there.

And do you have any logs?

Where would those be? I never understood journalctl.

journalctl --since=today

Hmm, it was actually yesterday that I started the remove, so I did --since=yesterday. I am looking at the log now, please stand by.

This is my log: http://pastebin.com/mCPi3y9r

But I fear that it became read-only before actually writing the error to the filesystem.
Re: Unrecoverable fs corruption?
Chris Murphy posted on Thu, 31 Dec 2015 18:22:09 -0700 as excerpted:

> On Thu, Dec 31, 2015 at 4:36 PM, Alexander Duscheleit wrote:
>> Hello,
>>
>> I had a power failure today at my home server, and after the reboot the
>> btrfs RAID1 won't come back up.
>>
>> When trying to mount one of the 2 disks of the array I get the
>> following error:
>> [ 4126.316396] BTRFS info (device sdb2): disk space caching is enabled
>> [ 4126.316402] BTRFS: has skinny extents
>> [ 4126.337324] BTRFS: failed to read chunk tree on sdb2
>> [ 4126.353027] BTRFS: open_ctree failed
>
> Why are you trying to mount only one? What mount options did you use
> when you did this?

Yes, please.

>> btrfs restore -viD seems to find most of the files accessible, but since
>> I don't have a spare hdd of sufficient size I would have to break the
>> array, reformat, and use one of the disks as the restore target. I'm not
>> prepared to do this before I know there is no other way to fix the
>> drives, since I'd essentially be destroying one more chance at saving
>> the data.
>
> Anyway, in the meantime, my advice is do not mount either device rw
> (together or separately). The fewer changes you make right now the
> better.
>
> What kernel and btrfs-progs version are you using?

Unless you've already tried it (hard to say without the mount options you used above), I'd first try a different tack than C Murphy suggests, falling back to what he suggests if it doesn't work. I suppose he assumes you've already tried this...

But first things first: as C Murphy suggests, when you post problems like this, *PLEASE* post kernel and progs userspace versions. Given the rate at which btrfs is still changing, that's pretty critical information. Also, if you're not running the latest or second-latest kernel or LTS kernel series and a similar or newer userspace, be prepared to be asked to try a newer version.

With the almost-released 4.4 set to be an LTS, that means 4.4 if you want to try it; the LTS kernel series 4.1 and 3.18; or the current or previous release series, 4.3 or 4.2 (though with 4.2 not being an LTS, updates are ended or close to it, so people on it should either upgrade to 4.3 or downgrade to 4.1 LTS anyway). And for userspace, a good rule of thumb is: whatever the kernel series, a corresponding or newer userspace as well.

With that covered... this is a good place to bring in something else CM recommended, but in a slightly different context. If you've read many of my previous posts, you likely know what I'm about to say.

The admin's first rule of backups says, in simplest form[1], that if you don't have a backup, by your actions you're defining the data that would be backed up as not worth the hassle and resources of making that backup. If in that case you lose the data, be happy, as you still saved what you defined by your actions as of /true/ value, regardless of any claims to the contrary: the hassle and resources you would have spent making that backup. =:^)

While the rule of backups applies in general, for btrfs it applies even more, because btrfs is still under heavy development; while btrfs is stabilizing, it's not yet fully stable and mature, so the risk of actually needing that backup remains correspondingly higher than it would ordinarily be.

But you didn't mention having backups, and did mention that you don't have a spare hdd, so you would have to break the array to have a place to do a btrfs restore to. That reads very much like you don't have ANY BACKUPS AT ALL!!

Of course, in the context of the above backups rule, I guess you understand the implications: that you consider the value of that data essentially throw-away, particularly since you still don't have a backup, despite running a not entirely stable filesystem that puts the data at greater risk than a fully stable filesystem would. Which means no big deal. You've obviously saved the time, hassle and resources necessary to make that backup, which is obviously of more value to you than the data that's not backed up, so the data is obviously of low enough value that you can simply blow away the filesystem with a fresh mkfs and start over. =:^)

Except... were that the case, you probably wouldn't be posting. Which brings entirely new urgency to what CM said about getting that spare hdd, so you can actually create that backup; and count yourself very lucky if you don't lose your data before you have it backed up, since your previous actions were unfortunately not in accordance with the value you seem to be claiming for the data.

OK, the rest of this post is written with the assumption that your claims and your actions regarding the value of the data in question agree, and that since you're still trying to recover the data, you don't consider it just throw-away, which means you now have someplace to put that backup, should you actually be lucky enough to get the chance to make
how btrfs uses devid?
Dear all:

If a btrfs device is missing, the command-line tool tells the user the devid of the missing device.

I understand that each device (disk) in a btrfs volume has been assigned a uuid (the UUID_SUB field in udevadm info output). If the device is missing, it's hard to ask the user to type such a uuid string on the command line, so devid is for convenience.

In our product, we want to record all disk information for a volume in a file. A disk may be missing not because it's broken, but because the user has many disks and in some cases may put back the wrong one. In this scenario, we can provide the disk information (such as the serial number) to the user and help them check whether they did something wrong.

My question is: is the devid just an alias for the sub-uuid? For a given disk device, is it never changed by any btrfs operation, including add, remove, balance and replace? Or may it change, and when?

One more question, just for curiosity. I checked the source code of btrfs-progs briefly. It seems there is no data structure in the superblock recording all sub-uuids or all devids for the volume, so how does btrfs figure out the missing devid? They are not always sequential integers; for example, after one device is removed, its devid is simply gone and the devids of the other devices are not renumbered.

matianfu
cannot repair filesystem
Hi,

If I try to repair the filesystem I get an assert. I use RAID6.

Linux dibsi 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt4-3~bpo70+1 (2015-02-12) x86_64 GNU/Linux

root@dibsi:~/btrfs-progs# btrfs fi show
Label: none  uuid: 6a2f3936-d0ef-43c0-9815-41e24f2bc21a
        Total devices 1 FS bytes used 26.63GiB
        devid 1 size 111.70GiB used 49.04GiB path /dev/sdf2

Label: none  uuid: 73d4dc77-6ff3-412f-9b0a-0d11458faf32
        Total devices 5 FS bytes used 1.17TiB
        devid 1 size 931.51GiB used 420.78GiB path /dev/sdb
        devid 2 size 931.51GiB used 420.78GiB path /dev/sdc
        devid 3 size 931.51GiB used 420.78GiB path /dev/sdd
        devid 4 size 931.51GiB used 420.78GiB path /dev/sde
        devid 5 size 931.51GiB used 420.78GiB path /dev/sda

root@dibsi:~/btrfs-progs# btrfs check --repair /dev/disk/by-uuid/73d4dc77-6ff3-412f-9b0a-0d11458faf32
enabling repair mode
parent transid verify failed on 2280450637824 wanted 861168 found 860380   [repeated 2x]
checksum verify failed on 2280450637824 found BF5F5D16 wanted AE725F92   [repeated 2x]
bytenr mismatch, want=2280450637824, have=15938376490240
repair mode will force to clear out log tree, Are you sure? [y/N]: y
parent transid verify failed on 2280260939776 wanted 861166 found 860368   [repeated 2x]
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223   [repeated 2x]
bytenr mismatch, want=2280260939776, have=15937230354176
[the previous three lines repeat many more times; output truncated]
Re: how btrfs uses devid?
On Fri, Jan 01, 2016 at 08:16:28PM +0800, UGlee wrote:
> Dear all:
>
> If a btrfs device is missing, the command tool tells the user the devid
> of the missing device.
>
> I understand that each device (disk) in a btrfs volume has been
> assigned a uuid (UUID_SUB field in udevadm info output). If the device
> is missing, it's hard to tell the user to input such a uuid string on
> the command line. So devid is for convenience.
>
> In our product, we want to record all disk information of a volume in
> a file. A disk may be missing not because it's broken, but because the
> user has so many disks and in some cases they may put back the wrong
> one. In this scenario, we can provide the disk information (such as the
> serial number) to the user and help them to check if they did something
> wrong.
>
> My question is: is the devid just an alias for the sub-uuid? For a given
> disk device, is it never changed during any btrfs operation, including
> add, remove, balance and replace? Or may it be changed, and when?

Actually, devid is the ID that the FS uses internally in the device tree to identify devices. It's not just a convenience; it's the "official" identifier for the device within the filesystem.

> One more question, just for curiosity. I checked the source code of
> btrfs-progs briefly. It seems that there is no data structure in the
> superblock recording all sub-uuids or all devids for the volume, so
> how does btrfs figure out the missing devid? They are not always
> sequential integers; for example, after one device is removed, the
> devid is simply removed and the devids of the other devices are not
> re-numbered.

The devices that should be there (identified by devid) are listed in the device tree. If one of those doesn't match up with a currently-known device for that filesystem (as determined by btrfs dev scan), then it's missing.

Hugo.

--
Hugo Mills             | I gave up smoking, drinking and sex once. It was the
hugo@... carfax.org.uk | scariest 20 minutes of my life.
http://carfax.org.uk/  | PGP: E2AB1DE4
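For the product use-case above (recording per-volume disk information to a file), one approach is to map each devid to its device path from `btrfs fi show` output and then look up the serial per path. The sketch below does only the parsing step, on sample output in the format seen elsewhere in this digest, so it runs without a btrfs filesystem; the uuid and device names are taken from that sample, and the udevadm lookup is left as a comment.

```shell
#!/bin/sh
# Sketch: build a "devid -> device path" table from "btrfs fi show" output.
# In a real script the sample would come from:  btrfs fi show <uuid>
# and serials could then be fetched per path with something like:
#   udevadm info --query=property --name=/dev/sdb   (look for ID_SERIAL)
sample="Label: none  uuid: 73d4dc77-6ff3-412f-9b0a-0d11458faf32
    Total devices 5 FS bytes used 1.17TiB
    devid    1 size 931.51GiB used 420.78GiB path /dev/sdb
    devid    2 size 931.51GiB used 420.78GiB path /dev/sdc"

# devid is field 2, the path is the last field on each devid line.
TABLE=$(printf '%s\n' "$sample" | awk '$1 == "devid" {print $2, $NF}')
printf '%s\n' "$TABLE"
```

Comparing a freshly built table against the recorded one would reveal which devid (and hence which recorded serial) is missing or swapped.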
Btrfs send / receive freeze system?
Hi everyone.

It's been a few weeks since I converted my root partition to btrfs with three subvolumes named boot, root, home. I'm booting with /boot on a subvolume.

I'm using btrfs send and receive to make backups of the three snapshotted subvolumes on a second btrfs-formatted drive, with three commands like this:

btrfs send /btrfs-root/snap-root/ | gzip > $BKFOLDER/root.dump.gz

Sometimes, let's say once a week, the system completely freezes (mouse, keyboard) during the send; the only solution was the reset button. The freeze happened in a different place every time. Last time it happened at 80% of the home send, for example.

The freeze also happened during a send/receive from an external e-SATA drive (to copy some mp3s using send instead of rsync) to the same internal drive where the backups are also made.

The system always ran and was stable with XFS/xfsdump.

Kernel is 4.3.3, btrfs-progs are 4.3.1, system is Arch Linux 64-bit, RAM 8GB, mainboard Asus Striker Extreme Nvidia 680i, 8 years old.

After the crash nothing is shown in the systemd log, it simply freezes.

Thanks, regards,
Mario
Btrfs send / receive freeze system - Addendum
Sorry, I forgot to mention that this freeze started happening after I converted the backup drive to btrfs; before that it was XFS. So sending to the XFS drive didn't cause the freeze, while sending with the same script to the btrfs-formatted drive freezes the system. Kernel and btrfs-progs were the same.

Regards.
Re: btrfs scrub failing
Hi Duncan,

> On Jan 1, 2016, at 12:05 PM, Duncan <1i5t5.dun...@cox.net> wrote:
>
> John Center posted on Fri, 01 Jan 2016 11:41:20 -0500 as excerpted:
>
>> Ok, I'll upgrade to 4.3 and see if that resolves the problem with
>> scrubbing. I was wondering, when I compiled the btrfs-tools, whether
>> there would be a problem with them not being in sync with the major
>> kernel version.
>
> FWIW, newer (or older, as long as they're not too old, you don't need
> any of the features missing from the older version, and you're not
> trying to fix problems only the newer can deal with) versions of
> btrfs-progs should be fine. As a rule of thumb I recommend staying at
> least current with the kernel version, but that's primarily to prevent
> getting /too/ old. Both the btrfs-progs userspace and the kernel itself
> are normally designed to work with both older and newer versions of the
> other.
>
> So userspace not being in sync with the kernel version shouldn't be a
> problem.

Ok, good to know.

> Well, I'm hoping someone who had the problem can confirm whether it's
> fixed in current kernels (scrub is one of those userspace commands
> that's mostly just a front-end to the kernel code which does the real
> work, so the kernel version is the important thing for scrub). I'm
> guessing so, and that you'll find the problem gone in 4.3.

I wasn't aware of this. Good to know.

> We'll cross the not-gone bridge if we get to it, but again, if the
> other people who had the similar problem can confirm whether it
> disappeared for them with the new kernel, it would help a lot. There
> were enough such reports that if it's the same problem and still there
> for everyone (which I doubt, as I expect there'd still be way more
> posts about it if so, but confirmation's always good), there's nothing
> to do but wait for a fix; while if not, and you still have your
> problem, it's a different issue and the devs will need to work with you
> on a fix specific to your problem.

Ok, understood. Thanks and Happy New Year!

-John
Re: Add big device, remove small device, read-only
Rasmus Abrahamsen posted on Fri, 01 Jan 2016 12:47:08 +0100 as excerpted:

> Happy New Year!
>
> I have a raid with a 1TB, .5TB, 1.5TB and recently added a 4TB and
> want to remove the 1.5TB. When saying btrfs dev delete it turned into
> readonly. I am on 4.2.5-1-ARCH and btrfs-progs v4.3.1. What can I do?

This isn't going to help with the specific problem, and doesn't apply to
your case now anyway as the 4 TB device has already been added, so all
you're doing now is deleting the old one, but FWIW...

There's a fairly new command, btrfs replace, that can be used to
directly replace an old device with a new one, instead of doing btrfs
device add followed by btrfs device delete/remove.

> On top of that, my linux is on this same raid, so perhaps btrfs is
> writing some temp files in the filesystem but cannot?
> /dev/sdc1 on / type btrfs
> (ro,relatime,space_cache,subvolid=1187,subvol=/linux)

Your wording leaves me somewhat confused. You say your Linux,
presumably your root filesystem, is on the same raid as the filesystem
that is having problems. That would imply that it's a different
filesystem, which in turn would imply that the raid is below the
filesystem level, say mdraid, dmraid, or hardware raid, with both your
btrfs root filesystem and the separate btrfs with the problems on the
same raid-based device, presumably partitioned so you can put multiple
filesystems on the same device. That would of course generally mean the
two btrfs themselves aren't raid, unless you are using at least one
non-btrfs raid as one device under a btrfs raid.

But while implied, that's not really supported by what you said, which
suggests a single btrfs raid filesystem instead. In which case, perhaps
you meant that this filesystem contains your root filesystem as well,
not just that the raid contains it.
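[For reference, the two approaches above can be sketched as follows.
The device names and mount point are hypothetical, purely illustrative;
substitute whatever your own btrfs fi show output reports.]

```shell
# Hypothetical devices: /dev/sdc is the old 1.5TB, /dev/sde the new 4TB,
# and /mnt/pool is the mounted filesystem.

# Older two-step approach: add the new device, then remove the old one.
# The remove triggers a balance that migrates the old device's chunks:
btrfs device add /dev/sde /mnt/pool
btrfs device remove /dev/sdc /mnt/pool

# Newer one-step approach: copy the old device's contents straight onto
# the new device in a single pass:
btrfs replace start /dev/sdc /dev/sde /mnt/pool
btrfs replace status /mnt/pool   # poll progress of the running replace
```

Note that replace copies at the chunk level rather than rebalancing the
whole array, so it is usually faster and puts less stress on the other
devices.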
Of course, if your post had included the usual btrfs fi show and btrfs
fi df output (btrfs fi usage would be good as well) that the wiki
recommends be posted with such reports, that might make things clearer,
but it doesn't, so we're left guessing... But I'm assuming you meant a
single multi-device btrfs, not multiple btrfs that happen to be on the
same non-btrfs raid.

Another question the show and df output would answer is what btrfs raid
mode you're running. The default for a multiple-device btrfs is of
course raid1 metadata and single-mode data, but you might well have set
it up with data and metadata in the same mode, and/or with raid0/5/6/10
for one or both of data and metadata. You didn't say and didn't provide
the btrfs command output that would show it, so...

> Ralle: did you do balance before removing?
>
> I did not, but I have experience with it balancing itself upon doing
> so. Upon removing a device, that is.
> I am just not sure how to proceed now that everything is read-only.

You were correct in that regard. btrfs device remove (or btrfs replace)
triggers a balance as part of the process, so balancing after adding a
device, only to have balance trigger again with the delete/remove, is
needless.

Actually, I suspect the remove-triggered balance ran across a problem it
didn't know how to handle when attempting to move one of the chunks from
the existing device, and that's what put the filesystem in read-only
mode. That's usually what has happened when btrfs device remove
triggers problems and people report them here, anyway. A balance before
the remove would simply have triggered the problem then instead.

But what the specific problem is, and what to do about it, remains to be
seen. Having that btrfs fi show and btrfs fi df output would be a good
start, letting us know at least what raid type we're dealing with, etc.

> I hope that you have backups?
>
> I do have backups, but it's on Crashplan, so I would prefer not to
> have to go there.
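[The diagnostic output asked for above can be gathered like this; the
mount point is hypothetical, substitute your own:]

```shell
# List all btrfs filesystems, their devices, and per-device allocation:
btrfs filesystem show

# Show space split by block-group type and profile (this reveals the
# raid mode in use, e.g. "Data, RAID1" vs "Data, single"):
btrfs filesystem df /mnt/pool

# Newer combined view, including unallocated space per device:
btrfs filesystem usage /mnt/pool
```

All three commands work on a mounted filesystem; fi show also works
unmounted, scanning for btrfs signatures on available devices.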
That's wise, both him asking and you replying that you already have them
but would just prefer to avoid using them. Waaayyy too many folks
posting here find out the hard way about the admin's first rule of
backups, in simplified form: if you don't have backups, you are
declaring by your actions that the data not backed up is worth less to
you than the time, resources and hassle required to do those backups,
despite any after-the-fact protests to the contrary. Not being in that
group already puts you well ahead of the game! =:^)

> and do you have any logs?
>
> Where would those be?
> I never understood journalctl
>
> journalctl --since=today
>
> Hmm, it was actually yesterday that I started the remove, so I did
> --since=yesterday. I am looking at the log now, please stand by.
> This is my log http://pastebin.com/mCPi3y9r But I fear that it became
> read-only before actually writing the error to the filesystem

Hmm... Looks like my strategy of having both systemd's journald and
syslog-ng might pay off. I have journald configured to only do
temporary files, which it keeps in /run/log/journal, with /run of
course tmpfs. That
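[As a sketch, the kernel-side btrfs errors relevant here can usually be
pulled out of the journal with standard journalctl options; the grep
filter is just illustrative:]

```shell
# Kernel messages only (-k), since yesterday, filtered for btrfs.
# The error that flipped the filesystem read-only is logged by the
# kernel, so it survives even if the filesystem itself went ro:
journalctl -k --since=yesterday | grep -i btrfs

# If the journal is volatile (stored on tmpfs under /run/log/journal)
# and was lost on reboot, the current boot's ring buffer still helps:
dmesg | grep -i btrfs
```

Note the read-only flip doesn't lose these messages: journald on a
separate (or tmpfs-backed) location, syslog, and the kernel ring buffer
are all independent of the failing filesystem.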
Re: btrfs scrub failing
On Friday, 1 January 2016 at 11:41:20 CET, John Center wrote:

Happy New Year!

>> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
>>
>> John Center posted on Thu, 31 Dec 2015 11:20:28 -0500 as excerpted:
>>
>>> I run a weekly scrub, using Marc Merlin's btrfs-scrub script.
>>> Usually, it completes without a problem, but this week it failed. I
>>> ran the scrub manually & it stops shortly:
>>>
>>> john@mariposa:~$ sudo /sbin/btrfs scrub start -BdR /dev/md124p2
>>> ERROR: scrubbing /dev/md124p2 failed for device id 1:
>>> ret=-1, errno=5 (Input/output error)
>>> scrub device /dev/md124p2 (id 1) canceled
>>> scrub started at Thu Dec 31 00:26:34 2015
>>> and was aborted after 00:01:29
[...]
>>> My Ubuntu 14.04 workstation is using the 4.2 kernel (Wily).
>>> I'm using btrfs-tools v4.3.1.
[...]
>> A couple months ago, which would have made it around the 4.2 kernel
>> you're running (with 4.3 being current and 4.4 nearly out), there
>> were a number of similar scrub-aborted reports on the list.
>
> I must have missed that, I'll check the list again to try & understand
> the issue better.

I had repeatedly failing scrubs, as mentioned in another thread here,
until I used the 4.4 kernel. With the 4.3 kernel scrub also didn't
work. I didn't use the debug options you used above, and I am not sure
whether I had this scrub issue with 4.2 already, so I am not sure it has
been the same issue. But you may need to run the 4.4 kernel in order to
get scrub working again.

See my thread "[4.3-rc4] scrubbing aborts before finishing" for details.

Thanks,
--
Martin
Re: btrfs scrub failing
Hi Martin,

Happy New Year!

> On Jan 1, 2016, at 12:41 PM, Martin Steigerwald wrote:
>
> On Friday, 1 January 2016 at 11:41:20 CET, John Center wrote:
>> Happy New Year!
>>
>>>> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
>>>>
>>>> A couple months ago, which would have made it around the 4.2 kernel
>>>> you're running (with 4.3 being current and 4.4 nearly out), there
>>>> were a number of similar scrub aborted reports on the list.
>>>
>>> I must have missed that, I'll check the list again to try &
>>> understand the issue better.
>
> I had repeatedly failing scrubs as mentioned in another thread here,
> until I used the 4.4 kernel. With the 4.3 kernel scrub also didn't
> work. I didn't use the debug options you used above and I am not sure
> whether I had this scrub issue with 4.2 already, so I am not sure it
> has been the same issue. But you may need to run the 4.4 kernel in
> order to get scrub working again.
>
> See my thread "[4.3-rc4] scrubbing aborts before finishing" for
> details.

I was afraid of this. I just read your thread. I generally try to stay
away from kernels so new, but I may have to try it. Was there any
reason you didn't go to 4.1 instead? (I run Win8.1 in VirtualBox 5.0.12
when I need to run some things under Windows. I'd have to wait until
4.4 is released & supported to do that.)

Thanks.

-John
Btrfs send / receive freeze system - Addendum
Sorry, I forgot to mention that this freeze started happening after I
converted this backup drive to btrfs; before that it was XFS. Sending
to the XFS drive didn't cause the freeze, while sending with the same
script to the btrfs-formatted drive freezes the system. The kernel and
btrfs-progs versions were the same in both cases.

Regards.
Re: btrfs scrub failing
Hi Duncan,

> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
>
> John Center posted on Thu, 31 Dec 2015 11:20:28 -0500 as excerpted:
>
>> I run a weekly scrub, using Marc Merlin's btrfs-scrub script.
>> Usually, it completes without a problem, but this week it failed. I
>> ran the scrub manually & it stops shortly:
>>
>> john@mariposa:~$ sudo /sbin/btrfs scrub start -BdR /dev/md124p2
>> ERROR: scrubbing /dev/md124p2 failed for device id 1:
>> ret=-1, errno=5 (Input/output error)
>> scrub device /dev/md124p2 (id 1) canceled
>> scrub started at Thu Dec 31 00:26:34 2015
>> and was aborted after 00:01:29
[...]
>> My Ubuntu 14.04 workstation is using the 4.2 kernel (Wily).
>> I'm using btrfs-tools v4.3.1.
[...]
> A couple months ago, which would have made it around the 4.2 kernel
> you're running (with 4.3 being current and 4.4 nearly out), there were
> a number of similar scrub-aborted reports on the list.

I must have missed that. I'll check the list again to try & understand
the issue better.

> I don't recall seeing any directly related patches, but the reports
> died down; whether because everybody having them had reported already,
> or because a newer kernel fixed the problem, I'm not sure, as I never
> had the problem myself[1].
>
> So I'd suggest upgrading to either the current 4.3 kernel or the
> latest 4.4-rc, and hopefully the problem will be gone. If I'd had the
> problem myself I could tell you for sure whether it went away for me
> with 4.3, but as I didn't...

Ok, I'll upgrade to 4.3 & see if that resolves the problem with
scrubbing. I was wondering when I compiled the btrfs-tools if there
would be a problem with them not being in sync with the major kernel
version.

If this doesn't resolve the problem, what would you recommend my next
steps should be? I've been hesitant to run too many of the btrfs-tools,
mainly because I don't want to accidentally screw things up & I don't
always know how to interpret the results.
(I ran btrfs-debug-tree, hoping something obvious would show up. Big
mistake.)

Thanks for your help!

-John
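[For what it's worth, a less invasive way to poke at a failing scrub
than btrfs-debug-tree is to run it in the background and watch status
and the kernel log; the device path below is taken from the report
above, and the grep filter is just illustrative:]

```shell
# Start the scrub in the background (no -B) and poll it. Both start
# and status are standard btrfs-progs subcommands:
sudo btrfs scrub start /dev/md124p2
sudo btrfs scrub status /dev/md124p2   # progress, bytes scrubbed, error counts

# If it aborts again, the kernel log usually names the failing device
# or sector, since scrub runs in the kernel:
dmesg | grep -iE 'btrfs|md124'
```

Scrub status and the kernel log are both read-only operations, so
unlike repair-oriented tools they can't make an existing problem worse.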
Re: btrfs scrub failing
John Center posted on Fri, 01 Jan 2016 11:41:20 -0500 as excerpted:

> Ok, I'll upgrade to 4.3 & see if that resolves the problem with
> scrubbing.
> I was wondering when I compiled the btrfs-tools if there would be a
> problem with them not being in sync with the major kernel version.

FWIW, newer (or older, as long as it's not too old, you don't need any
features the older version lacks, and you're not trying to fix problems
only the newer can deal with) versions of btrfs-progs should be fine.
As a rule of thumb I recommend staying at least current to the kernel
version, but that's primarily to keep from getting /too/ old. Both the
btrfs-progs userspace and the kernel itself are normally designed to
work with both older and newer versions of the other.

So userspace not being in sync with the kernel version shouldn't be a
problem.

> If this doesn't resolve the problem, what would you recommend my next
> steps should be? I've been hesitant to run too many of the
> btrfs-tools, mainly because I don't want to accidentally screw things
> up & I don't always know how to interpret the results. (I ran
> btrfs-debug-tree, hoping something obvious would show up. Big
> mistake.)

LOLed at that debug-tree remark. Been there (with other tools) myself.

Well, I'm hoping someone who had the problem can confirm whether it's
fixed in current kernels (scrub is one of those userspace commands
that's mostly just a front-end to the kernel code which does the real
work, so kernel version is the important thing for scrub). I'm guessing
so, and that you'll find the problem gone in 4.3.
We'll cross the not-gone bridge if we get to it, but again, if the other
people who had the similar problem can confirm whether it disappeared
for them with the new kernel, it would help a lot. There were enough
such reports that if it's the same problem and still there for everyone
(which I doubt, as I expect there'd still be way more posts about it if
so, but confirmation's always good), there's nothing to do but wait for
a fix. If not, and you still have your problem, then it's a different
issue and the devs will need to work with you on a fix specific to your
problem.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman